LabelBinariser
- class mvpy.preprocessing.LabelBinariser(neg_label: int = 0, pos_label: int = 1)
Class to create and handle multiclass and multifeature one-hot encodings.
For multiclass inputs, this produces a simple one-hot encoding of shape
(n_trials, n_classes).
For multifeature inputs, this produces a vectorised one-hot encoding of shape
(n_trials, n_features * n_classes), where there is one hot class per feature. If the features differ in their number of classes, the width is the total number of classes across features.
- Parameters:
- neg_label : int, default=0
Label to use for negatives.
- pos_label : int, default=1
Label to use for positives.
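To make the role of the two parameters concrete, here is a minimal NumPy sketch. It assumes that `neg_label` fills every "off" entry and `pos_label` marks the "on" entry of each row; the helper `one_hot` is hypothetical and is not part of mvpy.

```python
import numpy as np

def one_hot(y, neg_label=0, pos_label=1):
    """Illustrative one-hot encoder (hypothetical helper, not the mvpy implementation)."""
    # classes: sorted unique labels; codes: index of each sample's class
    classes, codes = np.unique(y, return_inverse=True)
    # start with every entry set to the negative label
    L = np.full((len(codes), len(classes)), neg_label, dtype=int)
    # mark the column of each sample's class with the positive label
    L[np.arange(len(codes)), codes] = pos_label
    return L

y = np.array([0, 1, 2, 1])
L_default = one_hot(y)                            # entries are 0 and 1
L_signed = one_hot(y, neg_label=-1, pos_label=1)  # entries are -1 and 1
```

A signed coding such as `neg_label=-1` can be convenient when the encoded labels feed a model that expects zero-centred targets.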
- Attributes:
- neg_label : int, default=0
Label to use for negatives.
- pos_label : int, default=1
Label to use for positives.
- n_features_ : int
Number of unique features in y of shape
(n_samples, n_features).
- n_classes_ : List[int]
Number of unique classes per feature.
- labels_ : List[List[Any]]
List including lists of original labels in y.
- classes_ : List[List[Any]]
List including lists of class identities in y.
- N_ : int | np.ndarray | torch.Tensor
Total number of classes (across features).
- C_ : np.ndarray | torch.Tensor
Offsets for each unique feature in one-hot matrix of shape
(n_features,).
- map_L_to_C_ : List[Dict[Any, int]]
Lists containing each label->class mapping per feature.
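The exact contents of the fitted attributes are not spelled out here, so the following is only one plausible reading of how they relate, written in plain NumPy. All variable names are hypothetical stand-ins for the attributes of similar name, and the assumption that each offset is the starting column of a feature's block is mine.

```python
import numpy as np

# two features: the first has three labels, the second has two
y = np.array([[10, 21], [10, 20], [11, 21], [12, 21]])

# hypothetical stand-in for labels_: original labels per feature
labels = [np.unique(y[:, j]).tolist() for j in range(y.shape[1])]
# hypothetical stand-in for n_classes_: number of classes per feature
n_classes = [len(l) for l in labels]
# hypothetical stand-in for N_: total number of one-hot columns
N = sum(n_classes)
# hypothetical stand-in for C_: assumed starting column of each feature's block
C = np.concatenate(([0], np.cumsum(n_classes)[:-1]))
# hypothetical stand-in for map_L_to_C_: label -> class index per feature
map_L_to_C = [{label: k for k, label in enumerate(l)} for l in labels]
```

Under this reading, the one-hot column for label `21` of the second feature would be `C[1] + map_L_to_C[1][21]`, which is column 4 of 5.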
Notes
Note that this always creates
n_classes columns in one-hot encodings, even when n_classes=2. This is because, in some situations, it can be easier to handle the data when all classes are explicitly represented in the data.
Warning
Only the numpy backend supports string labels, as torch does not offer support for string-type tensors. To avoid issues arising from this, stick to numerical labels unless you are certain that all analyses will run on the numpy backend.
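Both points can be illustrated with a small NumPy-only sketch that does not call mvpy. It encodes binary string labels and keeps one column per class, so the result has two columns instead of a single 0/1 column.

```python
import numpy as np

# binary string labels: fine for NumPy arrays, not representable as a torch tensor
y = np.array(["left", "right", "right", "left"])

# classes: sorted unique labels; codes: class index of each sample
classes, codes = np.unique(y, return_inverse=True)

# one column per class, even though there are only two classes
L = np.eye(len(classes), dtype=int)[codes]
```

If string labels are needed with the torch backend, one option is to map them to integer codes first (for example, the `codes` array above) and keep `classes` around to translate back.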
Examples
First, let’s consider one feature that has three classes.
>>> import torch
>>> from mvpy.preprocessing import LabelBinariser
>>> label = LabelBinariser().to_torch()
>>> y = torch.randint(0, 3, (100,))
>>> L = label.fit_transform(y)
>>> H = label.inverse_transform(L)
>>> print(y[0:5])
tensor([0, 1, 2, 1, 2])
>>> print(L[0:5])
tensor([[1, 0, 0],
        [0, 1, 0],
        [0, 0, 1],
        [0, 1, 0],
        [0, 0, 1]])
>>> print(H[0:5])
tensor([0, 1, 2, 1, 2])
Second, let’s look at two features that have a different number of classes each.
>>> import torch
>>> from mvpy.preprocessing import LabelBinariser
>>> label = LabelBinariser().to_torch()
>>> y = torch.stack((torch.randint(10, 13, (50,)), torch.randint(20, 22, (50,))), dim = 1)
>>> L = label.fit_transform(y)
>>> H = label.inverse_transform(L)
>>> print(y[0:5])
tensor([[10, 21],
        [10, 20],
        [11, 21],
        [12, 21],
        [10, 20]])
>>> print(L[0:5])
tensor([[1, 0, 0, 0, 1],
        [1, 0, 0, 1, 0],
        [0, 1, 0, 0, 1],
        [0, 0, 1, 0, 1],
        [1, 0, 0, 1, 0]])
>>> print(H[0:5])
tensor([[10, 21],
        [10, 20],
        [11, 21],
        [12, 21],
        [10, 20]])
- clone() -> LabelBinariser
Obtain a clone of this class.
- Returns:
- binariser : mvpy.estimators.LabelBinariser
The clone.
- copy() -> LabelBinariser
Obtain a copy of this class.
- Returns:
- binariser : mvpy.estimators.LabelBinariser
The copy.
- fit(y: ndarray | Tensor, *args: Any) -> BaseEstimator
Fit the binariser.
- Parameters:
- y : np.ndarray | torch.Tensor
The data of shape
(n_samples[, n_features]).
- args : Any
Additional arguments.
- Returns:
- binariser : sklearn.base.BaseEstimator
The fitted binariser.
- fit_transform(y: ndarray | Tensor, *args: Any) -> ndarray | Tensor
Fit and transform the data in one step.
- Parameters:
- y : np.ndarray | torch.Tensor
The data of shape
(n_samples[, n_features]).
- args : Any
Additional arguments.
- Returns:
- L : np.ndarray | torch.Tensor
The binarised data of shape
(n_samples, [n_features * ]n_classes).
- inverse_transform(y: ndarray | Tensor, *args: Any) -> ndarray | Tensor
Obtain labels from transformed data.
- Parameters:
- L : np.ndarray | torch.Tensor
The binarised data of shape
(n_samples, [n_features * ]n_classes). Note that this argument is named y in the signature.
- args : Any
Additional arguments.
- Returns:
- y : np.ndarray | torch.Tensor
The labels of shape
(n_samples[, n_features]).
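As a rough illustration of how labels can be recovered from a vectorised one-hot matrix, the sketch below splits the columns into one block per feature and takes the arg-max within each block. It is written in plain NumPy under the assumption that blocks are laid out feature by feature; it is not the mvpy implementation, and the `classes` list is a hypothetical stand-in for the fitted labels.

```python
import numpy as np

# hypothetical fitted labels: three classes for feature 0, two for feature 1
classes = [np.array([10, 11, 12]), np.array([20, 21])]

# vectorised one-hot matrix with 3 + 2 = 5 columns
L = np.array([[1, 0, 0, 0, 1],
              [1, 0, 0, 1, 0],
              [0, 1, 0, 0, 1],
              [0, 0, 1, 0, 1]])

# column boundaries of each feature's block: [0, 3, 5]
bounds = np.cumsum([0] + [len(c) for c in classes])

# per feature: find the hot column inside the block, then look up its label
y = np.stack(
    [c[L[:, a:b].argmax(axis=1)] for c, a, b in zip(classes, bounds[:-1], bounds[1:])],
    axis=1,
)
```

Because the arg-max is taken per block, the same approach also works when the positive and negative labels are not 1 and 0, as long as the positive label is the larger of the two.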
- to_numpy()
Select the numpy binariser. Note that this cannot be called for conversion.
- Returns:
- binariser : mvpy.estimators._LabelBinariser_numpy
The numpy binariser.
- to_torch()
Select the torch binariser. Note that this cannot be called for conversion.
- Returns:
- binariser : mvpy.estimators._LabelBinariser_torch
The torch binariser.
- transform(y: ndarray | Tensor, *args: Any) -> ndarray | Tensor
Transform the data based on fitted binariser.
- Parameters:
- y : np.ndarray | torch.Tensor
The data of shape
(n_samples[, n_features]).
- args : Any
Additional arguments.
- Returns:
- L : np.ndarray | torch.Tensor
The binarised data of shape
(n_samples, [n_features * ]n_classes).
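The point of separating fitting from transforming is that new data are encoded against the classes learned earlier, so column order stays fixed across data sets. A minimal NumPy sketch of that idea follows. It assumes every label in the new data was seen during fitting; how mvpy treats unseen labels is not stated here.

```python
import numpy as np

y_train = np.array([2, 0, 1, 2])
y_test = np.array([1, 1, 0])

# "fit": remember the sorted unique labels of the training data
classes = np.unique(y_train)

# "transform": locate each new label among the fitted classes
codes = np.searchsorted(classes, y_test)

# one column per fitted class, including class 2, which is absent from y_test
L_test = np.eye(len(classes), dtype=int)[codes]
```

Note that `L_test` keeps three columns although `y_test` contains only two distinct labels, which is what makes encodings from different splits comparable.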