mlnext.pipeline.RelativeTimeEncoder

class mlnext.pipeline.RelativeTimeEncoder(timestamp_column: str, inplace: bool = True, output_name: str | None = None, offset: int = 0, unit: Literal['d', 'h', 'min', 's', 'ms'] = 'ms')

Bases: BaseEstimator, TransformerMixin

Calculates the relative time based on a timestamp_column.

Parameters:
  • timestamp_column (str) – Name of the timestamp column.

  • inplace (bool) – Whether to perform the operation inplace and replace the timestamp_column with the relative time.

  • output_name (str | None) – Name of the output column. Only applies when inplace is False. If inplace is False and output_name is None, the new column is named after the timestamp column with _relative appended as a suffix.

  • offset (int) – Offset added to the relative time.

  • unit (Literal['d', 'h', 'min', 's', 'ms']) – Unit of the time difference. Defaults to 'ms'.

Added in version 0.6.1.
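To make the interplay of unit, offset, and the default output name more concrete, here is a minimal pandas-only sketch. It is not the mlnext implementation: the helper relative_time is hypothetical, and the sketch assumes that the relative time is measured from the first row and that offset is added after conversion to the chosen unit. Neither assumption is confirmed by the documentation above.

```python
import pandas as pd

# One unit of each supported time step (mirrors the `unit` literals above).
_UNITS = {
    "d": pd.Timedelta(days=1),
    "h": pd.Timedelta(hours=1),
    "min": pd.Timedelta(minutes=1),
    "s": pd.Timedelta(seconds=1),
    "ms": pd.Timedelta(milliseconds=1),
}


def relative_time(df, timestamp_column, unit="ms", offset=0, output_name=None):
    """Hypothetical helper, not part of mlnext."""
    ts = df[timestamp_column]
    # Assumption: time is measured relative to the first row.
    rel = (ts - ts.iloc[0]) / _UNITS[unit] + offset
    out = df.copy()
    # Mirrors the documented default name: <timestamp_column>_relative.
    out[output_name or f"{timestamp_column}_relative"] = rel
    return out


data = pd.DataFrame(
    {"time": pd.date_range("2024-10-01 10:00:00", freq="2ms", periods=5)}
)
result = relative_time(data, "time", unit="ms")
```

With timestamps spaced 2 ms apart and unit="ms", the new time_relative column in this sketch holds 0.0, 2.0, 4.0, 6.0, and 8.0.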

Example

>>> import pandas as pd
>>> from mlnext import pipeline
>>> # `offset` and `unit` are placeholders; define them before running.
>>> encoder = pipeline.RelativeTimeEncoder(
...     timestamp_column='time',
...     inplace=False,
...     output_name='time_r',
...     offset=offset,
...     unit=unit,
... )
>>> data = pd.DataFrame(
...     {
...         "time": pd.date_range(
...             "2024-10-01 10:00:00",
...             freq="2ms",
...             periods=5,
...         )
...     }
... )
>>> encoder.fit_transform(data)
                     time  time_r
0 2024-10-01 10:00:00.000   0.100
1 2024-10-01 10:00:00.002   0.102
2 2024-10-01 10:00:00.004   0.104
3 2024-10-01 10:00:00.006   0.106
4 2024-10-01 10:00:00.008   0.108

Methods

fit

fit_transform

Fit to data, then transform it.

get_metadata_routing

Get metadata routing of this object.

get_params

Get parameters for this estimator.

set_output

Set output container.

set_params

Set the parameters of this estimator.

transform

Calculates the relative time for a timestamp column.

fit_transform(X, y=None, **fit_params)

Fit to data, then transform it.

Fits transformer to X and y with optional parameters fit_params and returns a transformed version of X.

Parameters:
  • X (array-like of shape (n_samples, n_features)) – Input samples.

  • y (array-like of shape (n_samples,) or (n_samples, n_outputs), default=None) – Target values (None for unsupervised transformations).

  • **fit_params (dict) – Additional fit parameters.

Returns:

X_new – Transformed array.

Return type:

ndarray array of shape (n_samples, n_features_new)
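The relationship between fit_transform and the separate fit and transform calls can be checked with any scikit-learn transformer. The sketch below uses StandardScaler purely as a stand-in (it is not related to RelativeTimeEncoder) and assumes scikit-learn and NumPy are installed.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[0.0, 10.0], [2.0, 20.0], [4.0, 30.0]])

# One call: fit on X, then return the transformed X.
combined = StandardScaler().fit_transform(X)

# Two calls: fit first, then transform the same data.
scaler = StandardScaler().fit(X)
separate = scaler.transform(X)

# For this transformer both routes give the same array.
same = np.allclose(combined, separate)
```

Calling fit and transform separately is mainly useful when the fitted transformer is later applied to data it was not fitted on.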

get_metadata_routing()

Get metadata routing of this object.

See the scikit-learn User Guide section on metadata routing for how the routing mechanism works.

Returns:

routing – A MetadataRequest encapsulating routing information.

Return type:

MetadataRequest

get_params(deep=True)

Get parameters for this estimator.

Parameters:

deep (bool, default=True) – If True, will return the parameters for this estimator and contained subobjects that are estimators.

Returns:

params – Parameter names mapped to their values.

Return type:

dict
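The effect of deep is easiest to see on a nested estimator. The following sketch uses a scikit-learn Pipeline with a StandardScaler step as a stand-in; the step name "scale" is an arbitrary choice for illustration.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler(with_mean=False))])

# deep=False: only the pipeline's own constructor parameters.
shallow = pipe.get_params(deep=False)

# deep=True: additionally exposes the parameters of contained
# estimators under keys of the form <component>__<parameter>.
deep = pipe.get_params(deep=True)
nested_value = deep["scale__with_mean"]
```

Here the key scale__with_mean appears only in the deep dictionary, while steps is present in both.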

set_output(*, transform=None)

Set output container.

See the scikit-learn example "Introducing the set_output API" for an example on how to use the API.

Parameters:

transform ({"default", "pandas", "polars"}, default=None) –

Configure output of transform and fit_transform.

  • "default": Default output format of a transformer

  • "pandas": DataFrame output

  • "polars": Polars output

  • None: Transform configuration is unchanged

Added in version 1.4: “polars” option was added.

Returns:

self – Estimator instance.

Return type:

estimator instance

set_params(**params)

Set the parameters of this estimator.

The method works on simple estimators as well as on nested objects (such as Pipeline). The latter have parameters of the form <component>__<parameter> so that it’s possible to update each component of a nested object.

Parameters:

**params (dict) – Estimator parameters.

Returns:

self – Estimator instance.

Return type:

estimator instance
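The <component>__<parameter> form can be tried on a small scikit-learn Pipeline. In this sketch the step name "scale" and the choice of StandardScaler are illustrative assumptions, not part of mlnext.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

pipe = Pipeline([("scale", StandardScaler())])

# Update a parameter of the nested "scale" step through the pipeline.
returned = pipe.set_params(scale__with_mean=False)

# The nested estimator now carries the new value.
nested_value = pipe.named_steps["scale"].with_mean
```

Because set_params returns the estimator instance, returned is the same object as pipe.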

transform(X: DataFrame) → DataFrame

Calculates the relative time for a timestamp column.

Parameters:

X (pd.DataFrame) – Input.

Raises:

ValueError – Raised if the timestamp column was not found.

Returns:

Returns the new dataframe.

Return type:

pd.DataFrame
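The documented ValueError suggests a guard along the following lines. This is only a sketch of such a check: the helper require_column and the error message are hypothetical, and the actual check and wording inside mlnext may differ.

```python
import pandas as pd


def require_column(df: pd.DataFrame, column: str) -> None:
    """Hypothetical guard, not part of mlnext."""
    if column not in df.columns:
        raise ValueError(f"Column '{column}' not found in the DataFrame.")


data = pd.DataFrame({"value": [1, 2, 3]})

try:
    # "time" is missing from `data`, so the guard raises.
    require_column(data, "time")
    error_message = None
except ValueError as exc:
    error_message = str(exc)
```

Callers who cannot guarantee the column is present may want to wrap transform in a similar try/except ValueError block.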