mlnext.pipeline.ZeroVarianceDropper

class mlnext.pipeline.ZeroVarianceDropper(verbose: bool = False)

Bases: BaseEstimator, TransformerMixin

Removes all numeric columns that have zero variance. The transformer must be fitted before transform is called. A warning is issued if a column that was registered as zero variance during fitting later deviates.

Example

>>> import pandas as pd
>>> from mlnext.pipeline import ZeroVarianceDropper
>>> data = pd.DataFrame({'a': [0.0, 0.0], 'b': [1.0, 0.0]})
>>> ZeroVarianceDropper().fit_transform(data)
pd.DataFrame({'b': [1.0, 0.0]})

Methods

  • fit – Finds all columns with zero variance.

  • fit_transform – Fit to data, then transform it.

  • get_params – Get parameters for this estimator.

  • set_output – Set output container.

  • set_params – Set the parameters of this estimator.

  • transform – Drops all zero-variance columns found by fit.

fit(X, y=None)

Finds all columns with zero variance.

Parameters:
  • X (pd.DataFrame) – Dataframe.

  • y (array-like, optional) – Labels. Defaults to None.

Returns:

The fitted transformer (self).

Return type:

ZeroVarianceDropper
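The detection step described for fit can be approximated with plain pandas. The following is a minimal sketch, not mlnext's actual implementation: `find_zero_variance_columns` is a hypothetical helper, and treating "at most one distinct value" as the zero-variance criterion is an assumption made here to avoid floating-point comparisons.

```python
import pandas as pd


def find_zero_variance_columns(X: pd.DataFrame) -> list:
    """Return names of numeric columns whose values never change.

    Hypothetical helper for illustration; not part of mlnext.
    """
    # Only numeric columns are considered.
    numeric = X.select_dtypes(include='number')
    # A column with at most one distinct value has zero variance.
    counts = numeric.nunique()
    return counts[counts <= 1].index.tolist()


data = pd.DataFrame({
    'a': [0.0, 0.0, 0.0],  # numeric and constant
    'b': [1.0, 0.0, 2.0],  # numeric and varying
    'c': ['x', 'x', 'x'],  # constant but not numeric, so ignored
})
constant_columns = find_zero_variance_columns(data)
print(constant_columns)
```

Only column `a` is both numeric and constant, so it is the only candidate for removal in this sketch.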

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters.

Returns:

X_new – Transformed array.

Return type:

ndarray of shape (n_samples, n_features_new)
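fit_transform is inherited from scikit-learn's TransformerMixin, which chains fit and transform. The sketch below illustrates that equivalence with a hypothetical stand-in class, `ConstantColumnDropper`; it is not the mlnext class, and its fit logic is an assumption made for the example.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class ConstantColumnDropper(TransformerMixin, BaseEstimator):
    """Hypothetical stand-in transformer; not the mlnext class."""

    def fit(self, X, y=None):
        # Remember which columns hold at most one distinct value.
        self.constant_columns_ = [c for c in X.columns if X[c].nunique() <= 1]
        return self

    def transform(self, X):
        return X.drop(columns=self.constant_columns_)


data = pd.DataFrame({'a': [0.0, 0.0], 'b': [1.0, 0.0]})

# fit_transform is not defined above; it comes from TransformerMixin.
one_step = ConstantColumnDropper().fit_transform(data)
two_step = ConstantColumnDropper().fit(data).transform(data)
print(one_step.equals(two_step))
```

Because the stand-in's transform returns a DataFrame, fit_transform returns one as well; the ndarray return type in the inherited docstring is scikit-learn's generic wording.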

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict
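get_params comes from scikit-learn's BaseEstimator, which reports the arguments of the constructor. As a rough illustration, the sketch below uses a hypothetical `VerboseEstimator` with a single `verbose` argument, mirroring the signature shown for ZeroVarianceDropper; it is not the mlnext class.

```python
from sklearn.base import BaseEstimator


class VerboseEstimator(BaseEstimator):
    """Hypothetical estimator with one constructor argument."""

    def __init__(self, verbose: bool = False):
        self.verbose = verbose


# The returned dict maps each constructor argument to its current value.
params = VerboseEstimator(verbose=True).get_params()
print(params)
```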

set_output(*, transform=None)

Set output container.

See the scikit-learn example "Introducing the set_output API" for an example of how to use the API.

Parameters:

transform ({"default", "pandas"}, default=None) – Configure output of transform and fit_transform.

  • "default": Default output format of a transformer

  • "pandas": DataFrame output

  • None: Transform configuration is unchanged

Returns:

self – Estimator instance.

Return type:

estimator instance

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

estimator instance
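The <component>__<parameter> form is easiest to see on a small pipeline. The sketch below uses scikit-learn's Pipeline and StandardScaler; the step name `scale` is an arbitrary choice for this example.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipeline = Pipeline([('scale', StandardScaler())])

# '<component>__<parameter>' reaches into the step named 'scale'.
pipeline.set_params(scale__with_mean=False)

with_mean = pipeline.get_params()['scale__with_mean']
print(with_mean)
```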

transform(X)

Drops all zero-variance columns found by fit.

Parameters:

X (pd.DataFrame) – Dataframe.

Returns:

The dataframe without the zero-variance columns found by fit.

Return type:

pd.DataFrame
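The drop step and the deviation warning mentioned in the class description might look roughly like the sketch below. `drop_registered_columns` is a hypothetical helper, not mlnext's code. Whether mlnext still drops a deviating column is not stated on this page; this sketch assumes it does and only warns.

```python
import warnings

import pandas as pd


def drop_registered_columns(X: pd.DataFrame, registered: list) -> pd.DataFrame:
    """Drop columns registered at fit time; warn if one now varies.

    Hypothetical helper for illustration; not part of mlnext.
    """
    # A registered column with more than one distinct value has deviated.
    deviating = [c for c in registered if X[c].nunique() > 1]
    if deviating:
        warnings.warn(f'Columns no longer constant: {deviating}')
    return X.drop(columns=registered)


# 'a' was constant in the training data but varies in the new data.
new_data = pd.DataFrame({'a': [0.0, 5.0], 'b': [1.0, 0.0]})

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter('always')
    result = drop_registered_columns(new_data, registered=['a'])

print(list(result.columns))
print([str(w.message) for w in caught])
```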