mlnext.pipeline.DateExtractor

class mlnext.pipeline.DateExtractor(*, date_column: str, start_date: date, end_date: date, invert: bool = False, verbose: bool = False)

Bases: BaseEstimator, TransformerMixin

Drops rows whose date in date_column is not between start_date and end_date. Both limits are inclusive.

Example

>>> import datetime
>>> import pandas as pd
>>> from mlnext.pipeline import DateExtractor
>>> data = pd.DataFrame(
...     {'dates': [datetime.datetime(2021, 7, 1, 9, 50, 0),
...                datetime.datetime(2021, 7, 2, 11, 0, 0),
...                datetime.datetime(2021, 7, 3, 12, 10, 0)],
...      'values': [0, 1, 2]})
>>> DateExtractor(date_column='dates',
...               start_date=datetime.date(2021, 7, 2),
...               end_date=datetime.date(2021, 7, 2)).transform(data)
pd.DataFrame({'dates': [datetime.datetime(2021, 7, 2, 11, 0, 0)],
              'values': [1]})

Methods

`fit`
`fit_transform`	Fit to data, then transform it.
`get_params`	Get parameters for this estimator.
`set_output`	Set output container.
`set_params`	Set the parameters of this estimator.
`transform`	Drops rows whose date is not between the start and end date.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:

X (array-like of shape (n_samples, n_features)) – Input samples.
y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).
**fit_params (dict) – Additional fit parameters.

Returns:

X_new – Transformed array.

Return type:

ndarray of shape (n_samples, n_features_new)
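
For a transformer that does not override it, scikit-learn's `TransformerMixin.fit_transform` amounts to calling `fit` and then `transform` on the same data. The sketch below illustrates that equivalence with a hypothetical stateless transformer; `DropNegative` is an illustrative name and not part of mlnext.

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin


class DropNegative(BaseEstimator, TransformerMixin):
    """Hypothetical stateless transformer, for illustration only."""

    def fit(self, X, y=None):
        # Nothing to learn; return self so calls can be chained.
        return self

    def transform(self, X):
        # Keep rows with non-negative values and renumber the index.
        return X[X["values"] >= 0].reset_index(drop=True)


frame = pd.DataFrame({"values": [-1, 1, 2]})

# Both spellings run fit followed by transform on the same input.
one_step = DropNegative().fit_transform(frame)
two_step = DropNegative().fit(frame).transform(frame)
```

The result has whatever type `transform` returns, which is a DataFrame in this sketch.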

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict
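
As a rough illustration of what `get_params` returns, the sketch below uses a hypothetical stand-in class that only mirrors the constructor signature shown at the top of this page. `DateRangeStub` is an assumption made for the example; it is not the mlnext class and implements no filtering.

```python
import datetime

from sklearn.base import BaseEstimator


class DateRangeStub(BaseEstimator):
    """Hypothetical stand-in that copies DateExtractor's constructor arguments."""

    def __init__(self, *, date_column, start_date, end_date,
                 invert=False, verbose=False):
        # BaseEstimator expects each argument stored under the same name.
        self.date_column = date_column
        self.start_date = start_date
        self.end_date = end_date
        self.invert = invert
        self.verbose = verbose


stub = DateRangeStub(date_column="dates",
                     start_date=datetime.date(2021, 7, 1),
                     end_date=datetime.date(2021, 7, 3))

# One entry per constructor argument, including the defaults.
params = stub.get_params()
```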

set_output(*, transform=None)

Set output container.

See the scikit-learn example "Introducing the set_output API" for an example on how to use the API.

Parameters:

transform ({"default", "pandas"}, default=None) –

Configure output of transform and fit_transform.

"default": Default output format of a transformer
"pandas": DataFrame output
None: Transform configuration is unchanged

Returns:

self – Estimator instance.

Return type:

estimator instance

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

estimator instance
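
The `<component>__<parameter>` form is easiest to see on a small pipeline. The sketch below uses `StandardScaler` and the step name `"scale"` purely as illustrative choices; the same pattern would apply to any estimator placed in a `Pipeline`.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Simple estimator: parameters are addressed by their plain names.
scaler = StandardScaler()
returned = scaler.set_params(with_mean=False)

# Nested object: prefix the parameter with the step name and two underscores.
pipe = Pipeline([("scale", StandardScaler())])
pipe.set_params(scale__with_std=False)
```

`set_params` returns the estimator it was called on, so `returned` and `scaler` are the same object.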

transform(X)

Drops rows whose date is not between the start and end date. Bounds are inclusive. The returned DataFrame is reindexed.

Parameters:

X (pd.DataFrame) – DataFrame to filter.

Returns:

The new DataFrame.

Return type:

pd.DataFrame
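
The behaviour described above can be approximated in plain pandas. This is a minimal sketch under stated assumptions, not the mlnext implementation: `filter_by_date` is a hypothetical helper, timestamps are compared by calendar day, `invert` is assumed to keep the rows outside the range, and "reindexed" is assumed to mean the index is renumbered from 0.

```python
import datetime

import pandas as pd


def filter_by_date(df, date_column, start_date, end_date, invert=False):
    """Keep rows whose calendar day lies in [start_date, end_date].

    Hypothetical helper for illustration; not part of mlnext.
    """
    # Truncate timestamps to midnight so any time on end_date still counts.
    days = pd.to_datetime(df[date_column]).dt.normalize()
    mask = (days >= pd.Timestamp(start_date)) & (days <= pd.Timestamp(end_date))
    if invert:
        # Assumed meaning of invert: keep rows outside the range instead.
        mask = ~mask
    # Renumber the surviving rows from 0.
    return df.loc[mask].reset_index(drop=True)


data = pd.DataFrame(
    {"dates": [datetime.datetime(2021, 7, 1, 9, 50, 0),
               datetime.datetime(2021, 7, 2, 11, 0, 0),
               datetime.datetime(2021, 7, 3, 12, 10, 0)],
     "values": [0, 1, 2]})

day = datetime.date(2021, 7, 2)
inside = filter_by_date(data, "dates", day, day)
outside = filter_by_date(data, "dates", day, day, invert=True)
```

Comparing by calendar day matches the class example, where a row stamped 11:00 on the end date is kept.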