Preprocessing
This package provides tools for manipulating Dataset objects.
Base classes
- class sekupy.preprocessing.base.PreprocessingPipeline(name='pipeline', nodes=None, nodes_kwargs=None)
Pipeline for chaining multiple preprocessing transformers.
This class allows combining multiple preprocessing steps into a single pipeline that can be applied to datasets sequentially.
- Parameters:
- add(node)
Add a transformer node to the pipeline.
- Parameters:
node (Transformer) – The transformer node to add to the pipeline
- Returns:
Self, for method chaining
- Return type:
PreprocessingPipeline
- transform(ds)
Transform the dataset through all nodes in the pipeline.
This method applies each transformer in the pipeline sequentially to the dataset.
- Parameters:
ds (Dataset) – The dataset to transform
- Returns:
The transformed dataset after applying all pipeline nodes
- Return type:
Dataset
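The chaining behaviour described above can be illustrated with a minimal, self-contained sketch. The classes below (`SketchPipeline`, `Scale`, `Shift`) are hypothetical stand-ins written for this example; they are not sekupy classes and do not reproduce its implementation. The sketch only assumes what the reference states: `add` returns the pipeline itself, and `transform` applies each node in order. A plain list of floats stands in for a Dataset.

```python
class Scale:
    """Hypothetical node: multiplies every value by a factor."""

    def __init__(self, factor):
        self.factor = factor

    def transform(self, ds):
        return [x * self.factor for x in ds]


class Shift:
    """Hypothetical node: adds an offset to every value."""

    def __init__(self, offset):
        self.offset = offset

    def transform(self, ds):
        return [x + self.offset for x in ds]


class SketchPipeline:
    """Stand-in pipeline (an assumption, not sekupy's code)."""

    def __init__(self, name="pipeline", nodes=None):
        self.name = name
        # Copy the list so the caller's list is never mutated.
        self.nodes = list(nodes) if nodes is not None else []

    def add(self, node):
        self.nodes.append(node)
        return self  # returning self is what enables method chaining

    def transform(self, ds):
        # Apply each node in insertion order, feeding each output
        # into the next node.
        for node in self.nodes:
            ds = node.transform(ds)
        return ds


pipeline = SketchPipeline().add(Scale(2.0)).add(Shift(1.0))
result = pipeline.transform([1.0, 2.0, 3.0])
print(result)
```

Because the nodes run in insertion order, swapping `Scale` and `Shift` would generally give a different result.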
- class sekupy.preprocessing.base.Transformer(name='transformer', **kwargs)
Base class for data transformation components.
Transformers are used to preprocess datasets in the sekupy framework. They inherit from Node and provide functionality to transform datasets while tracking the applied transformations.
- Parameters:
- map_transformer(ds)
Map the transformer to the dataset’s preprocessing history.
This method records the transformer configuration in the dataset’s preprocessing attribute for reproducibility.
- Parameters:
ds (Dataset) – The dataset to which the transformer mapping is applied
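The idea of recording each applied transformer for reproducibility can be sketched as follows. `SketchDataset` and `RecordingTransformer` are hypothetical names introduced for this example, and storing the history as a list of dictionaries is an assumption; the reference only states that the configuration is recorded in the dataset's preprocessing attribute, not how it is stored.

```python
from dataclasses import dataclass, field


@dataclass
class SketchDataset:
    """Stand-in dataset carrying a preprocessing history (assumed layout)."""

    samples: list
    preprocessing: list = field(default_factory=list)


class RecordingTransformer:
    """Stand-in transformer that logs its own configuration."""

    def __init__(self, name="transformer", **kwargs):
        self.name = name
        self.kwargs = kwargs

    def map_transformer(self, ds):
        # Append this transformer's name and settings to the history.
        ds.preprocessing.append({"name": self.name, **self.kwargs})
        return ds

    def transform(self, ds):
        # A real transformer would modify the data here; this sketch
        # only records that the step was applied.
        return self.map_transformer(ds)


ds = SketchDataset(samples=[1, 2, 3])
ds = RecordingTransformer(name="detrend", order=1).transform(ds)
ds = RecordingTransformer(name="zscore").transform(ds)

history = [step["name"] for step in ds.preprocessing]
print(history)
```

Keeping the history on the dataset itself means the sequence of steps travels with the data, so it can be inspected or replayed later.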
Transformers
Balancing transformers : balance samples in the Dataset.
Connectivity transformers : transform the Dataset for connectivity analyses.
Filtering in time : time filtering.
Mapper : list of all Transformers
Math Transformer : transformation based on mathematical formulas (e.g. Fisher transformation)
Memory savers : for memory management
Normalizer : normalize features or samples.
Pipelines : used to concatenate different Transformers
Regression : transform based on regression and orthogonalization
Scikit-Learn Wrapper : wrapper for the sklearn package
Slicers : used to slice the dataset based on attributes
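As an illustration of the Fisher transformation mentioned in the list above, the sketch below applies the standard Fisher z-transformation, z = arctanh(r), to correlation values. The function name `fisher_z` is hypothetical and the code uses only the Python standard library; it is not sekupy's implementation, and how the package exposes this transformation is not shown here.

```python
import math


def fisher_z(r):
    """Standard Fisher z-transformation of a correlation coefficient.

    Defined for -1 < r < 1; math.atanh raises ValueError at the bounds.
    """
    return math.atanh(r)


correlations = [0.0, 0.5, -0.5]
z_values = [fisher_z(r) for r in correlations]
print(z_values)

# The transformation is inverted by tanh.
recovered = [math.tanh(z) for z in z_values]
```

The transformation is odd-symmetric, so `fisher_z(-r)` equals `-fisher_z(r)`, and a correlation of zero maps to zero.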