Balancing transformers

This set of Transformers is used to balance a dataset, so that every class has the same number of samples. It is based on imbalanced-learn.
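To make the idea of "same number of samples per class" concrete, here is a minimal, standard-library sketch of random undersampling. It is not the sekupy or imbalanced-learn implementation: the function name `random_undersample`, the seed, and the example labels are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter, defaultdict


def random_undersample(labels, seed=0):
    """Return sorted sample indices that keep an equal count for every class.

    Hypothetical helper for illustration only; not part of sekupy.
    """
    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    # Every class is reduced to the size of the smallest class.
    n_keep = min(len(indices) for indices in by_class.values())

    kept = []
    for indices in by_class.values():
        # Draw n_keep indices without replacement from this class.
        kept.extend(rng.sample(indices, n_keep))
    return sorted(kept)


# Made-up labels: 6 "face", 3 "house" and 4 "object" samples.
labels = ["face"] * 6 + ["house"] * 3 + ["object"] * 4
kept = random_undersample(labels)
balanced_counts = Counter(labels[i] for i in kept)
print(balanced_counts)
```

After undersampling, each class is left with as many samples as the smallest class had (three in this example). Which samples are dropped depends on the seed.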

sekupy.preprocessing.balancing.base module

class sekupy.preprocessing.balancing.base.Balancer(balancer=RandomUnderSampler(), attr='chunks', **kwargs)

This class transforms an unbalanced dataset into a balanced one.

Parameters:
  • balancer (BaseSampler, optional) – [description] (the default is RandomUnderSampler which [default_description])

  • attr (str, optional) – [description] (the default is ‘chunks’, which [default_description])

  • force_balance (boolean) – [description]

transform(ds)

Transform the provided dataset.

This method applies the transformation to the dataset and records the transformation in the dataset’s preprocessing history.

Parameters:

ds (Dataset) – The dataset to transform

Returns:

The transformed dataset

Return type:

Dataset

class sekupy.preprocessing.balancing.base.OverSamplingBalancer(balancer, attr='chunks', **kwargs)

class sekupy.preprocessing.balancing.base.SamplingBalancer(balancer, attr='chunks', name='balancer', **kwargs)

transform(ds)

Transform the provided dataset.

This method applies the transformation to the dataset and records the transformation in the dataset’s preprocessing history.

Parameters:

ds (Dataset) – The dataset to transform

Returns:

The transformed dataset

Return type:

Dataset

class sekupy.preprocessing.balancing.base.UnderSamplingBalancer(balancer, attr='chunks', **kwargs)

sekupy.preprocessing.balancing.imbalancer module

class sekupy.preprocessing.balancing.imbalancer.Imbalancer(sampling_strategy=0.75, attr=None, **kwargs)

get_ratio(y)

transform(ds)

Transform the provided dataset.

This method applies the transformation to the dataset and records the transformation in the dataset’s preprocessing history.

Parameters:

ds (Dataset) – The dataset to transform

Returns:

The transformed dataset

Return type:

Dataset
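The documentation above does not say how Imbalancer interprets sampling_strategy. As a hedged illustration only, the sketch below assumes the imbalanced-learn convention for binary problems, where a float sampling_strategy is the desired ratio of minority to majority samples after undersampling. The helper name `majority_size_for_ratio` and the class counts are hypothetical, and the actual behaviour of Imbalancer may differ.

```python
def majority_size_for_ratio(n_minority, n_majority, ratio):
    """Number of majority samples to keep so that minority / majority ~= ratio.

    Hypothetical helper; assumes the imbalanced-learn meaning of a float
    sampling_strategy (minority count divided by majority count).
    """
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in the interval (0, 1]")
    target = int(n_minority / ratio)
    # Simplification: undersampling cannot create samples, so cap the target
    # at the number of majority samples actually available.
    return min(target, n_majority)


# With 30 minority samples and a ratio of 0.75, 40 majority samples are kept.
print(majority_size_for_ratio(30, 100, 0.75))  # prints 40
```

Under this assumption, the default sampling_strategy=0.75 would correspond to three minority samples for every four majority samples.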

sekupy.preprocessing.balancing.utils module

sekupy.preprocessing.balancing.utils.chunk_generator(values, difference, new_targets)
sekupy.preprocessing.balancing.utils.default_generator(values, difference, new_targets)
sekupy.preprocessing.balancing.utils.event_generator(values, difference, new_targets)
sekupy.preprocessing.balancing.utils.sample_generator(key, values, difference, new_targets)
sekupy.preprocessing.balancing.utils.target_generator(values, difference, new_targets)
sekupy.preprocessing.balancing.utils.time_generator(values, difference, new_targets)