SVC#
- class mvpy.estimators.SVC(method: str = 'OvR', C: float = 1.0, kernel: str = 'linear', gamma: str | float = 'scale', coef0: float = 0.0, degree: float = 3.0, tol: float = 0.001, lr: float = 0.001, max_iter: int = 1000)[source]#
Implements a support vector classifier.
Support vector classifiers frame a classification problem mapping from neural data \(X\) to labels \(y\in\{1, -1\}\) as a max-margin problem:
\[f(X) = w^T\varphi(X) + b\]
that separates the classes with the largest possible margin in feature space \(\varphi(\cdot)\). As in KernelRidgeClassifier, \(\varphi(X)\) is a gram matrix defined by some kernel function. Contrary to KernelRidgeClassifier, however, SVC minimises a hinge-loss surrogate:
\[\arg\min_{w, b} \frac{1}{2}\lVert w\rVert^2 + C\sum_i\max\left(0, 1 - y_i f(X_i)\right)\]
Via the kernel trick, the decision function can be written in dual form as:
\[f(X) = \sum_{i\in\mathcal{S}} \alpha_i y_i \kappa(X_i, X) + b\]
where \(\alpha_i\ge 0\) are the dual coefficients over the set of support vectors \(\mathcal{S}\), and \(\kappa\) is a positive-definite kernel. Hyperparameters like the penalisation \(C\) are typically selected by cross-validation. Unlike KernelRidgeClassifier, penalty selection cannot be conveniently automated through LOO-CV here.
Compared to RidgeClassifier or KernelRidgeClassifier, SVC optimises a margin-based objective and often yields tighter decision boundaries, particularly when classes are not linearly well separated or when using a non-linear kernel, at the cost of higher training time.
For more information on support vector classifiers, see [1].
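Since \(C\) cannot be selected by LOO-CV here, a small grid search with k-fold cross-validation is the typical workflow. A minimal sketch (splitting utilities borrowed from sklearn; the grid values are illustrative only):

>>> import torch
>>> import mvpy as mv
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import make_circles
>>> from sklearn.model_selection import KFold
>>> X, y = make_circles(noise = 0.3)
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> cv_scores = {}
>>> for C in [0.1, 1.0, 10.0]:
...     folds = []
...     for train, test in KFold(n_splits = 5).split(X):
...         clf = SVC(C = C, kernel = 'rbf').fit(X[train], y[train])
...         folds.append(mv.math.accuracy(clf.predict(X[test]).squeeze(), y[test]))
...     cv_scores[C] = sum(folds) / len(folds)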
Warning
SVC is currently considered experimental. As is, it uses gradient ascent over vectorised features and stops early when the change in gradient norm \(\Delta\lVert\nabla\rVert\) falls below the tolerance. This diverges from sklearn's behaviour and may produce slightly degraded decision boundaries. In the future, we will be switching to an SMO routine that should resolve these issues.
- Parameters:
- method : {‘OvR’, ‘OvO’}, default=’OvR’
For multiclass problems, which method should we use? One-versus-one (OvO) or one-versus-rest (OvR)?
- C : float, default=1.0
Regularisation strength is inversely related to C.
- kernel : {‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’}, default=’linear’
Which kernel function should we use (linear, poly, rbf, sigmoid)?
- gamma : {‘scale’, ‘auto’} or float, default=’scale’
What gamma to use for the poly, rbf and sigmoid kernels: ‘scale’, ‘auto’, or a positive float.
- coef0 : float, default=0.0
What offset to use for the poly and sigmoid kernels.
- degree : float, default=3.0
What degree polynomial to use (if any).
- tol : float, default=1e-3
Tolerance over the maximum update step (i.e., early stopping is triggered when the maximal gradient < tol).
- lr : float, default=1e-3
The learning rate.
- max_iter : int, default=1000
The maximum number of iterations to perform while fitting, or -1 to disable.
- Attributes:
- method : {‘OvR’, ‘OvO’}, default=’OvR’
For multiclass problems, which method should we use? One-versus-one (OvO) or one-versus-rest (OvR)?
- C : float, default=1.0
Regularisation strength is inversely related to C.
- kernel : {‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’}, default=’linear’
Which kernel function should we use (linear, poly, rbf, sigmoid)?
- gamma : {‘scale’, ‘auto’} or float, default=’scale’
What gamma to use for the poly, rbf and sigmoid kernels: ‘scale’, ‘auto’, or a positive float.
- coef0 : float, default=0.0
What offset to use for the poly and sigmoid kernels.
- degree : float, default=3.0
What degree polynomial to use (if any).
- tol : float, default=1e-3
Tolerance over the maximum update step (i.e., early stopping is triggered when the maximal gradient < tol).
- lr : float, default=1e-3
The learning rate.
- max_iter : int, default=1000
The maximum number of iterations to perform while fitting, or -1 to disable.
- X_train_ : np.ndarray | torch.Tensor
A clone of the training data used internally for kernel estimation.
- A_ : np.ndarray | torch.Tensor
A clone of the alpha data used internally for kernel estimation.
- gamma_ : float
Estimated gamma parameter.
- eps_ : float, default=1e-12
Error margin for support vectors used internally.
- w_ : np.ndarray | torch.Tensor
If linear kernel, estimated weights.
- p_ : np.ndarray | torch.Tensor
If linear kernel, estimated patterns.
- intercept_ : np.ndarray | torch.Tensor
The intercept vector.
- coef_ : np.ndarray | torch.Tensor
If kernel is ‘linear’, the coefficients of the model.
- pattern_ : np.ndarray | torch.Tensor
If kernel is ‘linear’, the patterns used by the model.
- binariser_ : mvpy.preprocessing.LabelBinariser
The binariser used internally.
- scaler_ : mvpy.preprocessing.Scaler
The scaler used internally.
- metric_ : mvpy.metrics.accuracy
The default metric to use.
See also
mvpy.math.kernel_linear, mvpy.math.kernel_poly, mvpy.math.kernel_rbf, mvpy.math.kernel_sigmoid
Available kernel functions.
Notes
Coefficients are interpretable only when kernel is ‘linear’. In this case, patterns are computed as per [2].
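For instance, with a linear kernel, the fitted coefficients and the corresponding patterns can be inspected directly (a minimal sketch using the attributes documented above):

>>> import torch
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification()
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> clf = SVC(kernel = 'linear').fit(X, y)
>>> coef, pattern = clf.coef_, clf.pattern_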
References
[1] Awad, M., & Khanna, R. (2015). Support vector machines for classification. Efficient Learning Machines, 39-66. 10.1007/978-1-4302-5990-9_3
[2] Haufe, S., Meinecke, F., Görgen, K., Dähne, S., Haynes, J.D., Blankertz, B., & Bießmann, F. (2014). On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage, 87, 96-110. 10.1016/j.neuroimage.2013.10.067
Examples
First, let’s look at a case where we have one feature that has two classes.
>>> import torch
>>> import mvpy as mv
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import make_circles
>>> X, y = make_circles(noise = 0.3)
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> clf = SVC(kernel = 'rbf').fit(X, y)
>>> y_h = clf.predict(X)
>>> mv.math.accuracy(y_h.squeeze(), y)
tensor(0.6700)
Second, let’s look at a case where we have one feature that has three classes.
>>> import torch
>>> import mvpy as mv
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y = True)
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> clf = SVC(kernel = 'rbf').fit(X, y)
>>> y_h = clf.predict(X)
>>> mv.math.accuracy(y_h.squeeze(), y)
tensor(0.9733)
Third, let’s look at a case where we have two features with a variable number of classes.
>>> import numpy as np
>>> import torch
>>> import mvpy as mv
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import make_classification
>>> X0, y0 = make_classification(n_classes = 3, n_informative = 6)
>>> X1, y1 = make_classification(n_classes = 4, n_informative = 8)
>>> X = torch.from_numpy(np.concatenate((X0, X1), axis = -1)).float()
>>> y = torch.from_numpy(np.stack((y0, y1), axis = -1)).float()
>>> clf = SVC(kernel = 'rbf').fit(X, y)
>>> y_h = clf.predict(X)
>>> mv.math.accuracy(y_h.T, y.T)
tensor([1.0000, 0.9800])
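Finally, the multiclass strategy can be switched to one-versus-one, which fits one binary classifier per pair of classes (a brief sketch on the iris data; the resulting score is omitted, as it mirrors the OvR example above):

>>> import torch
>>> import mvpy as mv
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y = True)
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> clf = SVC(method = 'OvO', kernel = 'rbf').fit(X, y)
>>> acc = mv.math.accuracy(clf.predict(X).squeeze(), y)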
- decision_function(X: ndarray | Tensor) → ndarray | Tensor[source]#
Compute the decision function values for X.
- Parameters:
- X : np.ndarray | torch.Tensor
The features of shape (n_samples, n_channels).
- Returns:
- df : np.ndarray | torch.Tensor
The decision function values of shape (n_samples, n_classes).
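A short usage sketch (synthetic binary data; the column ordering of the returned decision values is not asserted here):

>>> import torch
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification()
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> df = SVC(kernel = 'linear').fit(X, y).decision_function(X)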
- fit(X: ndarray | Tensor, y: ndarray | Tensor) → BaseEstimator[source]#
Fit the estimator.
- Parameters:
- X : np.ndarray | torch.Tensor
The features of shape (n_samples, n_channels).
- y : np.ndarray | torch.Tensor
The targets of shape (n_samples[, n_features]).
- Returns:
- clf : mvpy.estimators.SVC
The classifier.
- predict(X: ndarray | Tensor) → ndarray | Tensor[source]#
Predict from the estimator.
- Parameters:
- X : np.ndarray | torch.Tensor
The features of shape (n_samples, n_channels).
- Returns:
- y_h : np.ndarray | torch.Tensor
The predictions of shape (n_samples, n_features).
- predict_proba(X: ndarray | Tensor) → ndarray | Tensor[source]#
Predict class probabilities from the estimator.
- Parameters:
- X : np.ndarray | torch.Tensor
The features of shape (n_samples, n_channels).
- Returns:
- p : np.ndarray | torch.Tensor
The predicted probabilities of shape (n_samples, n_classes).
Warning
Probabilities are computed from expit() over outputs of decision_function(). Consequently, probability estimates returned by this class are not calibrated. See Classifier for more information.
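To illustrate the warning above, the returned probabilities are a logistic squashing of the decision values rather than calibrated estimates; a brief sketch (the exact correspondence between the two outputs is an assumption about internals):

>>> import torch
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification()
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> clf = SVC(kernel = 'linear').fit(X, y)
>>> p = clf.predict_proba(X)       # sigmoid-transformed decision values
>>> df = clf.decision_function(X)  # raw margins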
- score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) → ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#
Make predictions from \(X\) and score against \(y\).
- Parameters:
- X : np.ndarray | torch.Tensor
Input data of shape (n_samples, n_channels).
- y : np.ndarray | torch.Tensor
Output data of shape (n_samples, n_features).
- metric : Optional[Metric | Tuple[Metric]], default=None
Metric or tuple of metrics to compute. If None, defaults to metric_.
- Returns:
- score : np.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]
Scores of shape (n_features,) or, for multiple metrics, a dictionary of metric names and scores of shape (n_features,).
Warning
If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.
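For example, scoring with the default metric returns a plain array, while passing a tuple returns a dictionary keyed by metric name (a sketch; mvpy.metrics.accuracy is the documented default metric):

>>> import torch
>>> import mvpy as mv
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y = True)
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> clf = SVC(kernel = 'rbf').fit(X, y)
>>> score = clf.score(X, y)                                    # uses metric_
>>> scores = clf.score(X, y, metric = (mv.metrics.accuracy,))  # {Metric.name: score, ...}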