LabelBinariser

class mvpy.preprocessing.LabelBinariser(neg_label: int = 0, pos_label: int = 1)

Class to create and handle multiclass and multifeature one-hot encodings.

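For multiclass inputs, this produces a simple one-hot encoding of shape (n_samples, n_classes).

For multifeature inputs, this produces a vectorised one-hot encoding of shape (n_samples, n_features * n_classes), where each feature contributes its own block of columns with one hot class per sample. When features differ in their number of classes, the width is the sum of the per-feature class counts (see N_ below).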
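
The layout described above can be sketched in plain NumPy. This is an illustration only, not the library's implementation: `one_hot_multifeature` is a hypothetical helper, and it assumes that classes within a feature are ordered by sorted label value, which agrees with the outputs shown in the examples further down.

```python
import numpy as np

def one_hot_multifeature(y):
    """Hypothetical helper: one-hot encode each column of y and concatenate the blocks."""
    y = np.asarray(y)
    if y.ndim == 1:
        # Treat a 1-D label vector as a single feature.
        y = y[:, None]
    blocks = []
    for j in range(y.shape[1]):
        # Sorted unique labels and, per sample, the index of its label.
        classes, idx = np.unique(y[:, j], return_inverse=True)
        block = np.zeros((y.shape[0], classes.size), dtype=int)
        block[np.arange(y.shape[0]), idx] = 1
        blocks.append(block)
    # One block of columns per feature, side by side.
    return np.concatenate(blocks, axis=1)

y = np.array([[10, 21], [10, 20], [11, 21], [12, 21], [10, 20]])
L = one_hot_multifeature(y)
print(L.shape)  # (5, 5): three columns for feature 0, two for feature 1
```

Each row contains exactly one hot entry per feature, so the row sums equal the number of features.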

Parameters:

neg_label (int, default=0): Label to use for negatives.
pos_label (int, default=1): Label to use for positives.
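
As a rough illustration of what these two parameters control, the NumPy sketch below fills a one-hot matrix with custom values. It assumes that the positive label marks the hot class and the negative label fills every other entry; the library's exact behaviour may differ.

```python
import numpy as np

# Assumed semantics: pos_label marks the hot class, neg_label everything else.
neg_label, pos_label = -1, 1

y = np.array([0, 2, 1])
classes, idx = np.unique(y, return_inverse=True)

# Boolean one-hot mask of shape (n_samples, n_classes).
hot = np.eye(classes.size, dtype=bool)[idx]

# Replace True/False with the chosen labels.
L = np.where(hot, pos_label, neg_label)
print(L)
```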

Attributes:

neg_label (int, default=0): Label to use for negatives.
pos_label (int, default=1): Label to use for positives.
n_features_ (int): Number of features in y of shape (n_samples, n_features).
n_classes_ (List[int]): Number of unique classes per feature.
labels_ (List[List[Any]]): Original labels in y, one list per feature.
classes_ (List[List[Any]]): Class identities in y, one list per feature.
N_ (int | np.ndarray | torch.Tensor): Total number of classes (across features).
C_ (np.ndarray | torch.Tensor): Offset of each feature in the one-hot matrix, of shape (n_features,).
map_L_to_C_ (List[Dict[Any, int]]): Label-to-class mapping, one dictionary per feature.
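
To make the relationship between these attributes concrete, the following NumPy sketch derives comparable quantities by hand. The variable names mirror the attributes, but the computation is an assumption about how they relate (in particular, that the offsets are the starting columns of each feature block), not a copy of the library's code.

```python
import numpy as np

y = np.array([[10, 21], [10, 20], [11, 21], [12, 21], [10, 20]])

# Sorted unique labels per feature (cf. labels_).
labels = [np.unique(y[:, j]).tolist() for j in range(y.shape[1])]

# Number of classes per feature (cf. n_classes_) and their total (cf. N_).
n_classes = [len(l) for l in labels]
N = sum(n_classes)

# Assumed: starting column of each feature block (cf. C_).
C = np.concatenate(([0], np.cumsum(n_classes)[:-1]))

# Label -> class index within each feature (cf. map_L_to_C_).
map_L_to_C = [{label: k for k, label in enumerate(l)} for l in labels]

# Column of label 21 in feature 1 under these assumptions.
column = int(C[1]) + map_L_to_C[1][21]
print(n_classes, N, C.tolist(), column)  # [3, 2] 5 [0, 3] 4
```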

Notes

Note that this always creates n_classes columns in the one-hot encoding, even when n_classes=2, because it can be easier to handle the data when all classes are explicitly represented.
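
For the binary case, the note implies a two-column encoding rather than a single column. A minimal NumPy sketch of that shape, independent of the library:

```python
import numpy as np

y = np.array([0, 1, 1, 0])
classes, idx = np.unique(y, return_inverse=True)

# Both classes get their own column, even though there are only two.
L = np.eye(classes.size, dtype=int)[idx]
print(L.shape)  # (4, 2)
```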

Warning

Only the numpy backend supports string labels, as torch does not support string tensors. To avoid issues arising from this, use numerical labels unless you are certain that all analyses will run on the numpy backend.
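
One possible way to follow this advice is to map string labels to integer codes up front and keep the lookup table for later. The sketch below uses only NumPy; the label strings are made-up examples and the approach is a suggestion, not something the library requires.

```python
import numpy as np

# Made-up string labels for illustration.
y = np.array(["face", "house", "face", "tool"])

# classes holds the sorted unique strings, codes the integer code per sample.
classes, codes = np.unique(y, return_inverse=True)

# The integer codes can be one-hot encoded (or handed to any backend).
L = np.eye(classes.size, dtype=int)[codes]

# Recover the original strings from the one-hot matrix via the lookup table.
H = classes[L.argmax(axis=1)]
```

Because `codes` is purely numerical, it can be used with either backend, while `classes` stays on the numpy side for decoding.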

Examples

First, let’s consider one feature that has three classes.

>>> import torch
>>> from mvpy.preprocessing import LabelBinariser
>>> label = LabelBinariser().to_torch()
>>> y = torch.randint(0, 3, (100,))
>>> L = label.fit_transform(y)
>>> H = label.inverse_transform(L)
>>> print(y[0:5])
tensor([0, 1, 2, 1, 2])
>>> print(L[0:5])
tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
        [0, 1, 0],
        [0, 0, 1]])
>>> print(H[0:5])
tensor([0, 1, 2, 1, 2])

Second, let’s look at two features that have a different number of classes each.

>>> import torch
>>> from mvpy.preprocessing import LabelBinariser
>>> label = LabelBinariser().to_torch()
>>> y = torch.stack((torch.randint(10, 13, (50,)), torch.randint(20, 22, (50,))), dim = 1)
>>> L = label.fit_transform(y)
>>> H = label.inverse_transform(L)
>>> print(y[0:5])
tensor([[10, 21],
        [10, 20],
        [11, 21],
        [12, 21],
        [10, 20]])
>>> print(L[0:5])
tensor([[1, 0, 0, 0, 1],
        [1, 0, 0, 1, 0],
        [0, 1, 0, 0, 1],
        [0, 0, 1, 0, 1],
        [1, 0, 0, 1, 0]])
>>> print(H[0:5])
tensor([[10, 21],
        [10, 20],
        [11, 21],
        [12, 21],
        [10, 20]])

clone() → LabelBinariser

Obtain a clone of this class.

Returns:

binariser (mvpy.preprocessing.LabelBinariser): The clone.

copy() → LabelBinariser

Obtain a copy of this class.

Returns:

binariser (mvpy.preprocessing.LabelBinariser): The copy.

fit(y: ndarray | Tensor, *args: Any) → BaseEstimator

Fit the binariser.

Parameters:

y (np.ndarray | torch.Tensor): The data of shape (n_samples[, n_features]).
*args (Any): Additional arguments.

Returns:

binariser (sklearn.base.BaseEstimator): The fitted binariser.

fit_transform(y: ndarray | Tensor, *args: Any) → ndarray | Tensor

Fit and transform the data in one step.

Parameters:

y (np.ndarray | torch.Tensor): The data of shape (n_samples[, n_features]).
*args (Any): Additional arguments.

Returns:

L (np.ndarray | torch.Tensor): The binarised data of shape (n_samples, [n_features * ]n_classes).

inverse_transform(y: ndarray | Tensor, *args: Any) → ndarray | Tensor

Obtain labels from transformed data.

Parameters:

y (np.ndarray | torch.Tensor): The binarised data of shape (n_samples, [n_features * ]n_classes).
*args (Any): Additional arguments.

Returns:

y (np.ndarray | torch.Tensor): The labels of shape (n_samples[, n_features]).
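
Conceptually, recovering labels from a vectorised one-hot matrix amounts to taking the argmax within each feature's block of columns and looking up the original label. The NumPy sketch below illustrates that idea; `inverse_one_hot` and its `labels` argument are hypothetical and do not reflect the library's internals.

```python
import numpy as np

def inverse_one_hot(L, labels):
    """Hypothetical helper: decode per-feature one-hot blocks back to labels."""
    L = np.asarray(L)
    out, start = [], 0
    for feature_labels in labels:
        # Columns belonging to this feature.
        block = L[:, start:start + len(feature_labels)]
        # The hot column within the block indexes the original label.
        out.append(np.asarray(feature_labels)[block.argmax(axis=1)])
        start += len(feature_labels)
    return np.stack(out, axis=1)

labels = [[10, 11, 12], [20, 21]]
L = np.array([[1, 0, 0, 0, 1],
              [0, 0, 1, 1, 0]])
y = inverse_one_hot(L, labels)
print(y.tolist())  # [[10, 21], [12, 20]]
```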

to_numpy()

Select the numpy binariser. Note that this only selects the backend and cannot be used to convert an existing binariser.

Returns:

binariser (mvpy.estimators._LabelBinariser_numpy): The numpy binariser.

to_torch()

Select the torch binariser. Note that this only selects the backend and cannot be used to convert an existing binariser.

Returns:

binariser (mvpy.estimators._LabelBinariser_torch): The torch binariser.

transform(y: ndarray | Tensor, *args: Any) → ndarray | Tensor

Transform the data based on the fitted binariser.

Parameters:

y (np.ndarray | torch.Tensor): The data of shape (n_samples[, n_features]).
*args (Any): Additional arguments.

Returns:

L (np.ndarray | torch.Tensor): The binarised data of shape (n_samples, [n_features * ]n_classes).
