RobustScaler

class mvpy.preprocessing.RobustScaler(with_centering: bool = True, with_scaling: bool = True, quantile_range: tuple[float, float] = (25.0, 75.0), dims: list | tuple | int | None = None)

Implements a scaler that is robust to outliers.

By default, this scaler subtracts the median and then divides the data by the interquartile range, i.e. the difference between the 75th and 25th percentiles (see quantile_range). Unlike Scaler, which relies on the mean and variance, these statistics are barely affected by a small number of extreme values.

Both centering and scaling are optional and can be turned on or off using with_centering and with_scaling.
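The computation described above can be sketched in plain Python for a single feature. This is only an illustration, not the mvpy implementation: the helper names percentile and robust_scale are hypothetical, linear interpolation between data points is an assumption, and so is the fallback to a scale of 1 when the quantile range is zero.

```python
def percentile(ordered, q):
    """Return the q-th percentile (0 to 100) of pre-sorted data by linear interpolation."""
    position = (len(ordered) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = position - lower
    return ordered[lower] * (1.0 - fraction) + ordered[upper] * fraction


def robust_scale(values, with_centering=True, with_scaling=True,
                 quantile_range=(25.0, 75.0)):
    """Scale a 1-D list of floats; returns (scaled, centre, scale)."""
    ordered = sorted(values)
    # Centre on the median (50th percentile) unless centering is disabled.
    centre = percentile(ordered, 50.0) if with_centering else 0.0
    # Scale by the spread between the two requested quantiles.
    q_low, q_high = quantile_range
    if with_scaling:
        scale = percentile(ordered, q_high) - percentile(ordered, q_low)
    else:
        scale = 1.0
    # Assumption of this sketch: a zero range falls back to 1 to avoid dividing by zero.
    if scale == 0.0:
        scale = 1.0
    return [(v - centre) / scale for v in values], centre, scale


scaled, centre, scale = robust_scale([1.0, 2.0, 3.0, 4.0, 100.0])
print(centre, scale)  # 3.0 2.0
print(scaled)         # [-1.0, -0.5, 0.0, 0.5, 48.5]
```

The outlier 100.0 changes neither the median nor the interquartile range, so the remaining values keep the same scaled positions they would have had without it.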

Parameters:
with_centering : bool, default=True

If True, subtract the median from the data before scaling.

with_scaling : bool, default=True

If True, scale the data by the range between the quantiles given in quantile_range.

quantile_range : tuple[float, float], default=(25.0, 75.0)

The lower and upper quantiles, in percent, whose difference is used as the scale.

dims : int, list or tuple of ints, default=None

The dimensions over which to scale (None for first dimension).

Attributes:
with_centering : bool, default=True

If True, subtract the median from the data before scaling.

with_scaling : bool, default=True

If True, scale the data by the range between the quantiles given in quantile_range.

quantile_range : tuple[float, float], default=(25.0, 75.0)

The lower and upper quantiles, in percent, whose difference is used as the scale.

dims : int, list or tuple of ints, default=None

The dimensions over which to scale (None for first dimension).

dims_ : tuple[int], default=None

Tuple specifying the dimensions to scale over.

centre_ : torch.Tensor, default=None

The centre of each feature, estimated from X.

scale_ : torch.Tensor, default=None

The scale of each feature, estimated from X.

See also

mvpy.preprocessing.Scaler

An alternative scaler that normalises data to zero mean and unit variance.

mvpy.preprocessing.Clamp

A complementary class that implements clamping data at specific values.

Examples

>>> import torch
>>> from mvpy.preprocessing import RobustScaler
>>> scaler = RobustScaler().to_torch()
>>> X = torch.normal(5, 10, (1000, 5))
>>> X[500,0] = 1e3
>>> X.std(0)
tensor([32.9122,  9.9615, 10.1481, 10.1058,  9.7468])
>>> Z = scaler.fit_transform(X)
>>> Z.std(0)
tensor([2.7348, 0.7351, 0.7464, 0.7609, 0.7154])
>>> H = scaler.inverse_transform(Z)
>>> H.std(0)
tensor([32.9122,  9.9615, 10.1481, 10.1058,  9.7468])

clone() -> RobustScaler

Obtain a clone of this class.

Returns:
scaler : RobustScaler

The cloned robust scaler.

copy() -> RobustScaler

Obtain a copy of this class.

Returns:
scaler : RobustScaler

The copied robust scaler.

fit(X: ndarray | Tensor, *args: Any) -> RobustScaler

Fit the scaler.

Parameters:
X : np.ndarray | torch.Tensor

The data of arbitrary shape.

args : Any

Additional arguments.

Returns:
scaler : RobustScaler

The fitted scaler.

fit_transform(X: ndarray | Tensor, *args: Any) -> ndarray | Tensor

Fit and transform the data in one step.

Parameters:
X : np.ndarray | torch.Tensor

The data of arbitrary shape.

args : Any

Additional arguments.

Returns:
Z : np.ndarray | torch.Tensor

The transformed data, with the same shape as X.

inverse_transform(X: ndarray | Tensor, *args: Any) -> ndarray | Tensor

Invert the transform of the data.

Parameters:
X : np.ndarray | torch.Tensor

The scaled data to invert.

args : Any

Additional arguments.

Returns:
X : np.ndarray | torch.Tensor

The inverse-transformed data, with the same shape as the input.
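As a rough illustration of what inverting the transform means, the sketch below undoes a centre-and-scale step by multiplying by the scale and adding the centre back. The two helper functions are hypothetical stand-ins written for this example; the centre and scale values are supplied by hand rather than taken from a fitted mvpy scaler.

```python
import math


def transform(values, centre, scale):
    """Forward step: subtract the centre, then divide by the scale."""
    return [(v - centre) / scale for v in values]


def inverse_transform(scaled, centre, scale):
    """Inverse step: multiply by the scale, then add the centre back."""
    return [z * scale + centre for z in scaled]


original = [1.0, 2.0, 3.0, 4.0, 100.0]
centre, scale = 3.0, 2.0  # median and interquartile range of `original`

scaled = transform(original, centre, scale)
restored = inverse_transform(scaled, centre, scale)

# The round trip recovers the input up to floating-point error.
print(all(math.isclose(a, b) for a, b in zip(original, restored)))  # True
```

This mirrors the Examples section, where H.std(0) after inverse_transform matches X.std(0).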

to_numpy() -> _RobustScaler_numpy

Select the numpy backend. Note that this only selects the backend; it cannot be used to convert an existing scaler from one backend to another.

Returns:
scaler : _RobustScaler_numpy

The robust scaler using the numpy backend.

to_torch() -> _RobustScaler_torch

Select the torch backend. Note that this only selects the backend; it cannot be used to convert an existing scaler from one backend to another.

Returns:
scaler : _RobustScaler_torch

The robust scaler using the torch backend.

transform(X: ndarray | Tensor, *args: Any) -> ndarray | Tensor

Transform the data using the fitted scaler.

Parameters:
X : np.ndarray | torch.Tensor

The data to transform.

args : Any

Additional arguments.

Returns:
Z : np.ndarray | torch.Tensor

The transformed data, with the same shape as X.
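The division of labour between fit and transform can be sketched as follows: fit estimates one centre and one scale per feature along the first dimension, and transform reuses those stored statistics on new data. ColumnRobustScaler is a hypothetical stand-in built on the standard library, not the mvpy class, and its use of inclusive quartiles and of a fallback scale of 1 are assumptions of this sketch.

```python
import statistics


class ColumnRobustScaler:
    """Hypothetical per-column median/IQR scaler for a list of rows."""

    def fit(self, rows):
        # Reduce over the first dimension: one statistic per column (feature).
        columns = list(zip(*rows))
        self.centre_ = [statistics.median(column) for column in columns]
        self.scale_ = []
        for column in columns:
            q1, _, q3 = statistics.quantiles(column, n=4, method="inclusive")
            # Assumption of this sketch: a zero range falls back to 1.
            self.scale_.append((q3 - q1) or 1.0)
        return self

    def transform(self, rows):
        # Apply the stored statistics; nothing is re-estimated here.
        return [
            [(v - c) / s for v, c, s in zip(row, self.centre_, self.scale_)]
            for row in rows
        ]


train = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0], [5.0, 50.0]]
scaler = ColumnRobustScaler().fit(train)
print(scaler.centre_, scaler.scale_)      # [3.0, 30.0] [2.0, 20.0]
print(scaler.transform([[7.0, 30.0]]))    # [[2.0, 0.0]]
```

Because transform only reads the stored statistics, held-out data is scaled with the values estimated during fit rather than with its own median and quantiles.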