Clamp

class mvpy.preprocessing.Clamp(lower: float | None = None, upper: float | None = None, method: str = 'iqr', k: float | None = None, eps: float = 1e-09, dims: list | tuple | int | None = None)

Implements a clamp to handle extreme values.

Generally, this will clamp data \(X\) to lower and upper bounds defined by lower and upper whenever they are exceeded.

This can be useful for dealing with outliers. For example, in minimally preprocessed M-/EEG data, it can curb EOG artifacts without removing time points or trials.

By default, both lower and upper are None. In this special case, the bounds are fit directly to the data. There are three ways of fitting bounds, controlled by method:

  1. iqr: This will compute the inter-quartile range \([0.25, 0.75]\) and clamp data where \(X\notin [\textrm{median}(X) - k L, \textrm{median}(X) + k U]\).

  2. quantile: This will clamp data outside of the quantiles given by \([k, 1 - k]\).

  3. mad: This will clamp data at \(\textrm{median}(X)\pm k \textrm{MAD}\), where MAD is the median absolute deviation.
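
As a rough illustration of the quantile and mad rules, the sketch below fits bounds on a one-dimensional list in plain Python. It is not mvpy's implementation: the helper names (quantile, fit_bounds, clamp), the linear interpolation between order statistics, and the unscaled MAD are assumptions made for this example. The iqr rule is left out because \(L\) and \(U\) are not spelled out here.

```python
import math
import statistics

# Default k per method, taken from the description of the k parameter.
DEFAULT_K = {"iqr": 1.5, "quantile": 0.05, "mad": 3.0}


def quantile(values, q):
    """Linearly interpolated quantile of a 1-D sequence (0 <= q <= 1)."""
    s = sorted(values)
    pos = q * (len(s) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(s) - 1)
    return s[lo] + (pos - lo) * (s[hi] - s[lo])


def fit_bounds(values, method="quantile", k=None):
    """Return (lower, upper) for the 'quantile' or 'mad' method."""
    k = DEFAULT_K[method] if k is None else k
    if method == "quantile":
        return quantile(values, k), quantile(values, 1.0 - k)
    if method == "mad":
        med = statistics.median(values)
        # Unscaled MAD (assumption; some libraries rescale it).
        mad = statistics.median(abs(v - med) for v in values)
        return med - k * mad, med + k * mad
    raise ValueError(f"method {method!r} is not covered by this sketch")


def clamp(values, lower, upper):
    """Clip every value into [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


data = [float(i) for i in range(101)]      # 0, 1, ..., 100
q_bounds = fit_bounds(data, "quantile")    # roughly the 5th and 95th percentiles
m_bounds = fit_bounds(data, "mad", k=1.0)  # median 50, MAD 25
clamped = clamp(data, *q_bounds)
```

With these inputs, the quantile bounds sit at about 5 and 95, while the mad bounds with k = 1 are 25 and 75.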

If only one of the two bounds is None, no clamping is applied in that direction.
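
One-sided clamping can be pictured with the following plain-Python sketch. The clamp helper is hypothetical and not part of mvpy. Unlike Clamp, it leaves the data unchanged when both bounds are None instead of fitting bounds.

```python
def clamp(values, lower=None, upper=None):
    """Clip values; a bound of None leaves that direction untouched."""
    out = []
    for v in values:
        if lower is not None and v < lower:
            v = lower
        if upper is not None and v > upper:
            v = upper
        out.append(v)
    return out


data = [-10.0, -1.0, 0.0, 1.0, 10.0]
only_upper = clamp(data, upper=5.0)   # large negative values survive
only_lower = clamp(data, lower=-5.0)  # large positive values survive
```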

Parameters:
lower : Optional[float], default=None

Lower bound for clamping. If None, no lower bound is applied.

upper : Optional[float], default=None

Upper bound for clamping. If None, no upper bound is applied.

method : {'iqr', 'quantile', 'mad'}, default='iqr'

Method used to fit the bounds when both lower and upper are None.

k : Optional[float], default=None

For method iqr, scale the \([0.25, 0.75]\) quantiles by \(k\) (default 1.5). For method quantile, clamp tails outside \([k, 1 - k]\) (default 0.05). For method mad, scale the median absolute deviation by \(k\) (default 3.0). Otherwise unused.

eps : float, default=1e-9

When checking span correctness, epsilon to apply as jitter.

dims : int, list or tuple of ints, default=None

The dimensions over which to clamp (None for the first dimension).

Attributes:
lower : Optional[float], default=None

Lower bound for clamping. If None, no lower bound is applied.

upper : Optional[float], default=None

Upper bound for clamping. If None, no upper bound is applied.

method : {'iqr', 'quantile', 'mad'}, default='iqr'

Method used to fit the bounds when both lower and upper are None.

k : Optional[float], default=None

For method iqr, scale the \([0.25, 0.75]\) quantiles by \(k\) (default 1.5). For method quantile, clamp tails outside \([k, 1 - k]\) (default 0.05). For method mad, scale the median absolute deviation by \(k\) (default 3.0). Otherwise unused.

eps : float, default=1e-9

When checking span correctness, epsilon to apply as jitter.

dims : int, list or tuple of ints, default=None

The dimensions over which to clamp (None for the first dimension).

lower_ : float | np.ndarray | torch.Tensor, default=None

Lower bound for clamping, either prespecified or fitted.

upper_ : float | np.ndarray | torch.Tensor, default=None

Upper bound for clamping, either prespecified or fitted.

dims_ : tuple[int], default=None

Tuple specifying the dimensions to clamp over.
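
One plausible reading of dims, assuming two-dimensional data and the default of the first dimension, is that one pair of bounds is fit per column by reducing over rows. The plain-Python sketch below illustrates that reading with the mad rule. The helper names and the unscaled MAD are assumptions for this example, not mvpy's code.

```python
import statistics


def fit_mad_bounds_per_column(rows, k=3.0):
    """Fit (lower, upper) per column of a 2-D list, reducing over rows."""
    bounds = []
    for column in zip(*rows):
        med = statistics.median(column)
        mad = statistics.median(abs(v - med) for v in column)
        bounds.append((med - k * mad, med + k * mad))
    return bounds


def clamp_rows(rows, bounds):
    """Clip each entry with the bounds belonging to its column."""
    return [
        [min(max(v, lo), hi) for v, (lo, hi) in zip(row, bounds)]
        for row in rows
    ]


rows = [
    [1.0, 10.0],
    [2.0, 20.0],
    [3.0, 30.0],
    [4.0, 40.0],
    [100.0, 50.0],  # outlier only in the first column
]
bounds = fit_mad_bounds_per_column(rows, k=2.0)
clamped = clamp_rows(rows, bounds)
```

Because each column gets its own bounds, the outlier in the first column is pulled in while the second column is left as it was.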

Examples

>>> import torch
>>> from mvpy.preprocessing import Clamp
>>> X = torch.normal(0, 1, (1000, 5))
>>> X[500, 0] = 10.0
>>> X.max(0).values
tensor([10.0000,  3.9375,  3.2070,  3.0591,  3.0165])
>>> Z = Clamp().fit_transform(X)
>>> Z.max(0).values
tensor([2.6926, 2.7263, 2.6343, 2.6616, 2.5378])
>>> Z = Clamp(upper=5.0).fit_transform(X)
>>> Z.max(0).values
tensor([5.0000, 3.9375, 3.2070, 3.0591, 3.0165])

clone() → Clamp

Obtain a clone of this class.

Returns:
clamp : Clamp

The cloned clamp.

copy() → Clamp

Obtain a copy of this class.

Returns:
clamp : Clamp

The copied clamp.

fit(X: ndarray | Tensor, *args: Any) → Clamp

Fit the clamp.

Parameters:
X : np.ndarray | torch.Tensor

The data of arbitrary shape.

args : Any

Additional arguments.

Returns:
clamp : sklearn.base.BaseEstimator

The fitted clamp.

fit_transform(X: ndarray | Tensor, *args: Any) → ndarray | Tensor

Fit and transform the data in one step.

Parameters:
X : np.ndarray | torch.Tensor

The data of arbitrary shape.

args : Any

Additional arguments.

Returns:
Z : np.ndarray | torch.Tensor

The transformed data, with the same shape as X.

inverse_transform(X: ndarray | Tensor, *args: Any) → ndarray | Tensor

Invert the transform of the data.

Parameters:
X : np.ndarray | torch.Tensor

The data of arbitrary shape.

args : Any

Additional arguments.

Returns:
X : np.ndarray | torch.Tensor

The inverse-transformed data, with the same shape as X.

Warning

Clamping cannot be inverse transformed. Consequently, this returns the clamped values in \(X\) as is.
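
The reason no inverse exists can be seen in a small plain-Python sketch: different inputs collapse onto the same clamped output, so the original values cannot be recovered. The clamp helper here is hypothetical and stands in for any fixed pair of bounds.

```python
def clamp(values, lower, upper):
    """Clip every value into [lower, upper]."""
    return [min(max(v, lower), upper) for v in values]


a = [0.5, 7.0, 2.0]
b = [0.5, 1000.0, 2.0]

# Two different inputs produce the same clamped output.
za = clamp(a, 0.0, 5.0)
zb = clamp(b, 0.0, 5.0)

# Clamping already-clamped data changes nothing.
za_again = clamp(za, 0.0, 5.0)
```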

to_numpy() → _Clamp_numpy

Select the numpy backend. Note that this cannot be called for conversion.

Returns:
clamp : _Clamp_numpy

The clamp using the numpy backend.

to_torch() → _Clamp_torch

Select the torch backend. Note that this cannot be called for conversion.

Returns:
clamp : _Clamp_torch

The clamp using the torch backend.

transform(X: ndarray | Tensor, *args: Any) → ndarray | Tensor

Transform the data using the clamp.

Parameters:
X : np.ndarray | torch.Tensor

The data of arbitrary shape.

args : Any

Additional arguments.

Returns:
Z : np.ndarray | torch.Tensor

The transformed data, with the same shape as X.