mvpy.estimators package#

Submodules#

mvpy.estimators.b2b module#

A collection of estimators for decoding and disentangling features using back2back regression.

class mvpy.estimators.b2b.B2B(alphas: Tensor | ndarray | float | int = 1, **kwargs)[source]#

Bases: BaseEstimator

Implements a back-to-back regression to disentangle causal contributions of correlated features.

The back-to-back estimator is a two-step estimator consisting of a decoder and an encoder. The idea is to split the data \(X\) and \(y\) into two folds, decode all features from the neural data in fold A, and then, in fold B, regress the decoder's predictions onto the true features. This yields a disentangled estimate of the causal contribution of each feature.

In practice, this is implemented as:

\[\hat{G} = (Y^T Y + \alpha_Y I)^{-1}Y^T X\]
\[\hat{H} = (X^T X + \alpha_X I)^{-1}X^T Y\hat{G}\]

where \(\hat{G}\) is the decoder and \(\hat{H}\) is the encoder, and \(\alpha\) are regularisation parameters. Consequently, the diagonal of \(\hat{H}\) contains the estimated causal contributions of our features.

For more information on B2B regression, please see [1].
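
To make the two equations concrete, here is a minimal closed-form sketch in torch (a hypothetical helper, assuming a single shared penalty and pre-scaled data; the B2B class itself additionally handles scaling, intercepts and cross-validated penalties):

import torch

def b2b_sketch(X: torch.Tensor, y: torch.Tensor, alpha: float = 1.0) -> torch.Tensor:
    # Split neural data X (n_trials, n_channels) and features y
    # (n_trials, n_features) into two folds.
    n = X.shape[0] // 2
    X_a, y_a, X_b, y_b = X[:n], y[:n], X[n:], y[n:]

    # Step 1 (fold A): decode all features from the neural data.
    G = torch.linalg.solve(X_a.T @ X_a + alpha * torch.eye(X_a.shape[1]), X_a.T @ y_a)

    # Step 2 (fold B): regress the decoder's predictions onto the true features.
    H = torch.linalg.solve(y_b.T @ y_b + alpha * torch.eye(y_b.shape[1]), y_b.T @ (X_b @ G))

    # The diagonal of H holds the disentangled causal contributions.
    return torch.diagonal(H)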

Parameters:
alphastorch.Tensor | np.ndarray, default=torch.tensor([1])

The penalties to use for estimation.

fit_interceptbool, default=True

Whether to fit an intercept.

normalisebool, default=True

Whether to normalise the data.

alpha_per_targetbool, default=False

Whether to use a different penalty for each target.

normalise_decoderbool, default=True

Whether to normalise decoder outputs.

Attributes:
alphastorch.Tensor | np.ndarray

The penalties to use for estimation.

fit_interceptbool

Whether to fit an intercept.

normalisebool

Whether to normalise the data.

alpha_per_targetbool

Whether to use a different penalty for each target.

normalise_decoderbool

Whether to normalise decoder outputs.

decoder_mvpy.estimators.RidgeDecoder

The decoder.

encoder_mvpy.estimators.RidgeDecoder

The encoder.

scaler_mvpy.estimators.Scaler

The scaler.

causal_torch.Tensor | np.ndarray

The causal contribution of each feature of shape (n_features,).

pattern_torch.Tensor | np.ndarray

The decoded patterns of shape (n_channels, n_features).

See also

mvpy.preprocessing.Scaler

If applied, scalers used in this class.

mvpy.estimators.RidgeDecoder

Ridge decoders used for the two-step procedure here.

Notes

When penalising per target by setting alpha_per_target to True, you may want to consider normalising the decoder by also setting normalise_decoder to True. This is because otherwise decoder outputs may live on very different scales, potentially distorting the causal estimates per predictor.

Patterns are computed as per [2]. However, these patterns are not disentangled and may, consequently, be less informative than desired, depending on the strength of the existing correlations.

References

[1]

King, J.R., Charton, F., Lopez-Paz, D., & Oquab, M. (2020). Back-to-back regression: Disentangling the influence of correlated factors from multivariate observations. NeuroImage, 220, 117028. 10.1016/j.neuroimage.2020.117028

[2]

Haufe, S., Meinecke, F., Görgen, K., Dähne, S., Haynes, J.D., Blankertz, B., & Bießmann, F. (2014). On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage, 87, 96-110. 10.1016/j.neuroimage.2013.10.067

Examples

>>> import torch
>>> from mvpy.estimators import B2B
>>> ß = torch.normal(0, 1, (2, 60))
>>> X = torch.normal(0, 1, (100, 2))
>>> y = X @ ß + torch.normal(0, 1, (100, 60))
>>> X, y = y, X
>>> y = torch.cat((y, y.mean(1).unsqueeze(-1) + torch.normal(0, 5, (100, 1))), 1)
>>> b2b = B2B()
>>> b2b.fit(X, y).causal_
tensor([0.4470, 0.4594, 0.0060])
clone() B2B[source]#

Clone this class.

Returns:
b2bmvpy.estimators.B2B

The cloned estimator.

fit(X: ndarray | Tensor, y: ndarray | Tensor) B2B[source]#

Fit the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The neural data of shape (n_trials, n_channels).

ynp.ndarray | torch.Tensor

The targets of shape (n_trials, n_features).

Returns:
b2bmvpy.estimators.B2B

The fitted estimator.

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The neural data of shape (n_trials, n_channels).

Returns:
y_hnp.ndarray | torch.Tensor

The predictions of shape (n_trials, n_features).

mvpy.estimators.classifier module#

A collection of estimators for decoding features using ridge classifiers.

class mvpy.estimators.classifier.Classifier(estimator: BaseEstimator, method: str = 'OvR', arguments: List[Any] = [], kwarguments: Dict[Any, Any] = {})[source]#

Bases: BaseEstimator

Implements a wrapper for classifiers that handle one-versus-one (OvO) and one-versus-rest (OvR) classification schemes.

While this class is exposed publicly, there are few (if any) direct use cases for it. In principle, it exists as a wrapper for other classifiers that want to handle multi-class cases as OvO or OvR; it can either be inherited from or instantiated with the desired estimator (the recommended option).

One-versus-rest (OvR) classification computes the decision functions over inputs \(X\) and then takes the maximum value across decision values to predict the most likely classes \(\hat{y}\).

One-versus-one (OvO) classification computes all decision functions from binary classifiers (e.g., \(c_0\) vs \(c_1\), \(c_0\) vs \(c_2\), \(c_1\) vs \(c_2\), …). For each individual classification problem, the maximum value is recorded as one vote for the winning class. Votes are then aggregated across all classifiers and the maximum number of votes decides the most likely classes \(\hat{y}\).
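
As a sketch of the two schemes over precomputed decision values (hypothetical helpers; the sign convention assumes a positive value favours the second class of a pair):

import torch

def predict_ovr(df: torch.Tensor) -> torch.Tensor:
    # df: (n_samples, n_classes) decision values; the largest value wins.
    return df.argmax(dim=1)

def predict_ovo(df: torch.Tensor, pairs: list, n_classes: int) -> torch.Tensor:
    # df: (n_samples, n_pairs), one decision value per binary problem.
    # Each binary classifier casts one vote; the most-voted class wins.
    votes = torch.zeros(df.shape[0], n_classes)
    for k, (c_i, c_j) in enumerate(pairs):
        winner = torch.where(df[:, k] > 0, torch.tensor(c_j), torch.tensor(c_i))
        votes[torch.arange(df.shape[0]), winner] += 1
    return votes.argmax(dim=1)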

Warning

When calling predict_proba(), probabilities are computed from expit() over outputs of decision_function(). While this outputs valid probabilities, they are consequently based on decision values that are on arbitrary scales. This may lead to ill-calibrated probability estimates. If accurate probability estimates are desired, please consider using CalibratedClassifier (to be implemented).
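
For illustration, such pseudo-probabilities can be sketched as follows (whether outputs are renormalised across classes is an assumption here, not a documented guarantee):

import torch

df = torch.tensor([[1.2, -0.3, 0.1]])  # decision values of shape (1, n_classes)
p = torch.sigmoid(df)                  # expit over decision values
p = p / p.sum(dim=1, keepdim=True)     # assumed: normalise rows to sum to one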

Parameters:
estimatorsklearn.base.BaseEstimator

The estimator type wrapped by this class.

method{‘OvR’, ‘OvO’}, default=’OvR’

For multiclass problems, which method should we use? One-versus-one (OvO) or one-versus-rest (OvR)?

argumentsList[Any], default=[]

Arguments to pass to the estimator at initialisation.

kwargumentsDict[str, Any], default=dict()

Keyword arguments to pass to the estimator at initialisation.

Attributes:
estimatorsklearn.base.BaseEstimator

The estimator type wrapped by this class.

method{‘OvR’, ‘OvO’}, default=’OvR’

For multiclass problems, which method should we use? One-versus-one (OvO) or one-versus-rest (OvR)?

argumentsList[Any], default=[]

Arguments to pass to the estimator at initialisation.

kwargumentsDict[str, Any], default=dict()

Keyword arguments to pass to the estimator at initialisation.

estimators_sklearn.base.BaseEstimator | List[sklearn.base.BaseEstimator]

All instances of the estimator class (only of type list if OvO).

binariser_mvpy.estimators.LabelBinariser

Label binariser used internally.

coef_np.ndarray | torch.Tensor

If available, coefficients from all classifiers ([n_classifiers,] n_channels, n_classes).

intercept_np.ndarray | torch.Tensor

If available, intercepts from all classifiers ([n_classifiers,] n_classes).

pattern_np.ndarray | torch.Tensor

If available, patterns from all classifiers ([n_classifiers,] n_channels, n_classes).

offsets_np.ndarray | torch.Tensor

Numerical offsets for each feature in outputs, used internally.

metric_mvpy.metrics.Accuracy

The default metric to use.

See also

mvpy.estimators.RidgeClassifier, mvpy.estimators.SVC

Classifiers that use this class as a wrapper.

mvpy.preprocessing.LabelBinariser

Label binariser used internally to generate one-hot encodings.

clone() Classifier[source]#

Clone this class.

Returns:
clfClassifier

The cloned object.

copy() Classifier[source]#

Clone this class.

Returns:
clfClassifier

The cloned object.

decision_function(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features (n_samples, n_channels).

Returns:
dfnp.ndarray | torch.Tensor

The predictions of shape (n_samples, n_classes).

fit(X: ndarray | Tensor, y: ndarray | Tensor) BaseEstimator[source]#

Fit the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features of shape (n_samples, n_channels).

ynp.ndarray | torch.Tensor

The targets of shape (n_samples[, n_features]).

Returns:
clfmvpy.estimators.Classifier

The classifier.

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features (n_samples, n_channels).

Returns:
y_hnp.ndarray | torch.Tensor

The predictions of shape (n_samples, n_features).

predict_proba(X: ndarray | Tensor) ndarray | Tensor[source]#

Compute probabilities assigned to each class.

Parameters:
Xnp.ndarray | torch.Tensor

The features of shape (n_samples, n_channels).

Returns:
pnp.ndarray | torch.Tensor

The predictions of shape (n_samples, n_classes).

Warning

Probabilities are computed from expit() over outputs of decision_function() where, for method OvO, we use Wu-Lin coupling. Consequently, probability estimates returned by this class are not calibrated.

score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#

Make predictions from \(X\) and score against \(y\).

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels).

ynp.ndarray | torch.Tensor

Output data of shape (n_samples, n_features).

metricOptional[Metric | Tuple[Metric]], default=None

Metric or tuple of metrics to compute. If None, defaults to metric_.

Returns:
scorenp.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]

Scores of shape (n_features,).

Warning

If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.

to_numpy() BaseEstimator[source]#

Obtain the estimator with numpy as backend.

Returns:
clfmvpy.estimators.classifier._Classifier_numpy

The estimator.

to_torch() BaseEstimator[source]#

Obtain the estimator with torch as backend.

Returns:
clfmvpy.estimators.classifier._Classifier_torch

The estimator.

mvpy.estimators.covariance module#

A collection of estimators for covariance estimation and pre-whitening of data.

class mvpy.estimators.covariance.Covariance(method: str = 'ledoitwolf', s_min: float | None = None, s_max: float | None = None)[source]#

Bases: BaseEstimator

Implements covariance and precision estimation as well as whitening of data.

For covariance estimation, three methods are currently available through method:

  1. empirical

    This computes the empirical (sample) covariance matrix:

    \[\Sigma = \mathbb{E}\left[(X - \mathbb{E}[X])(X - \mathbb{E}[X])^T\right]\]

    This is computationally efficient, but produces estimates of the covariance \(\Sigma\) that may often be unfavourable: Given small datasets or noisy measurements, \(\Sigma\) may be ill-conditioned and not positive-definite, with eigenvalues that tend to be systematically pushed towards the tails. In practice, this can make inversion challenging and hurts out-of-sample generalisation.

  2. ledoitwolf

    This computes the Ledoit-Wolf shrinkage estimator:

    \[\hat\Sigma = (1 - \hat{\delta})\Sigma + \hat\delta T\]

    where \(\hat{\delta}\in[0, 1]\) is the data-driven shrinkage intensity that minimises the expected Frobenius-norm risk:

    \[\hat\delta = \min\left\{1, \max\left\{0, \frac{\hat\pi}{\hat\rho}\right\}\right\},\qquad \hat\rho = \lvert\lvert\Sigma - T\rvert\rvert_F^2,\qquad \hat\pi = \frac{1}{n^2}\sum_{k=1}^{n}\lvert\lvert x_k x_k^T - \Sigma\rvert\rvert_F^2\]

    and where:

    \[T = \mu I_p,\qquad \mu = \frac{1}{p}\textrm{tr}(\Sigma)\]

    This produces estimates that are well-conditioned and positive-definite. For more information on this procedure, please see [1].

  3. oas

    This computes the oracle approximating shrinkage estimator:

    \[\hat\Sigma = (1 - \hat{\delta})\Sigma + \hat\delta T\]

    where \(\hat{\delta}\in[0, 1]\) is the data-driven shrinkage:

    \[\hat\delta = \frac{(1 - \frac{2}{p}) \textrm{tr}(\Sigma^2) + \textrm{tr}(\Sigma)^2}{(n + 1 - \frac{2}{p})\left(\textrm{tr}(\Sigma^2) - \frac{\textrm{tr}(\Sigma)^2}{p}\right)},\qquad T = \mu I_p,\qquad \mu = \frac{1}{p}\textrm{tr}(\Sigma)\]

    Like ledoitwolf, this procedure produces estimates that are well-conditioned and positive-definite. Contrary to ledoitwolf, shrinkage tends to be more aggressive in this procedure. For more information, please see [2].

When calling transform on this class, data will automatically be whitened based on the estimated covariance matrix. The whitening matrix is computed from the eigendecomposition as follows:

\[\Sigma = Q\Lambda Q^T,\qquad \Lambda = \textrm{diag}(\lambda_1, ..., \lambda_p) \geq 0,\qquad W = Q\Lambda^{-\frac{1}{2}}Q^T\]

For more information on whitening, refer to [3].
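
A minimal sketch of shrinkage towards the scaled identity and the resulting whitener (a hypothetical helper that takes the shrinkage intensity \(\hat\delta\) as given; the class estimates it from the data):

import torch

def shrink_and_whiten(X: torch.Tensor, delta: float):
    # X: (n_trials, n_features). Empirical covariance:
    X_c = X - X.mean(dim=0, keepdim=True)
    sigma = X_c.T @ X_c / X.shape[0]

    # Shrink towards T = mu * I with mu = tr(Sigma) / p.
    p = sigma.shape[0]
    mu = torch.trace(sigma) / p
    sigma_hat = (1 - delta) * sigma + delta * mu * torch.eye(p)

    # Whitener W = Q Lambda^{-1/2} Q^T from the eigendecomposition.
    lam, Q = torch.linalg.eigh(sigma_hat)
    W = Q @ torch.diag(lam.clamp_min(1e-12).rsqrt()) @ Q.T
    return sigma_hat, W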

Parameters:
method{‘empirical’, ‘ledoitwolf’, ‘oas’}, default = ‘ledoitwolf’

Which method should be applied for estimation of covariance?

s_minfloat, default = None

What’s the minimum sample we should consider in the time dimension?

s_maxfloat, default = None

What’s the maximum sample we should consider in the time dimension?

Attributes:
covariance_np.ndarray | torch.Tensor

Covariance matrix

precision_np.ndarray | torch.Tensor

Precision matrix (inverse of covariance matrix)

whitener_np.ndarray | torch.Tensor

Whitening matrix

shrinkage_float, default=None

Shrinkage parameter, if used by method.

Notes

This class assumes features to be the second to last dimension of the data, unless there are only two dimensions (in which case it is assumed to be the last dimension).

References

[1]

Ledoit, O., & Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88, 365-411. 10.1016/S0047-259X(03)00096-4

[2]

Chen, Y., Wiesel, A., Eldar, Y.C., & Hero, A.O. (2009). Shrinkage algorithms for MMSE covariance estimation. arXiv. 10.48550/arXiv.0907.4698

[3]

Kessy, A., Lewin, A., & Strimmer, K. (2016). Optimal whitening and decorrelation. arXiv. 10.48550/arXiv.1512.00809

Examples

>>> import torch
>>> from mvpy.estimators import Covariance
>>> X = torch.normal(0, 1, (100, 10, 100))
>>> cov = Covariance(s_max = 20).fit(X)
>>> cov.covariance_.shape
torch.Size([10, 10])
clone() Covariance[source]#

Obtain a clone of this class.

Returns:
covmvpy.estimators.Covariance

The cloned object.

fit(X: ndarray | Tensor, *args: Any) Covariance[source]#

Fit the covariance estimator.

Parameters:
Xnp.ndarray | torch.Tensor

Data to fit the estimator on of shape (n_trials, n_features[, n_timepoints]).

*argsAny

Additional arguments to pass to the estimator.

Returns:
selfCovariance

Fitted covariance estimator.

fit_transform(X: ndarray | Tensor, *args: Any) ndarray | Tensor[source]#

Fit the covariance estimator and whiten the data.

Parameters:
Xnp.ndarray | torch.Tensor

Data to fit the estimator on and transform of shape (n_trials, n_features[, n_timepoints]).

*argsAny

Additional arguments to pass to the estimator.

Returns:
Wnp.ndarray | torch.Tensor

Whitened data of shape (n_trials, n_features[, n_timepoints]).

to_numpy() BaseEstimator[source]#

Create the numpy estimator. Note that this function cannot be used for conversion.

Returns:
covmvpy.estimators.Covariance

The numpy estimator.

to_torch() BaseEstimator[source]#

Create the torch estimator. Note that this function cannot be used for conversion.

Returns:
covmvpy.estimators.Covariance

The torch estimator.

transform(X: ndarray | Tensor, *args: Any) ndarray | Tensor[source]#

Whiten data using the fitted covariance estimator.

Parameters:
Xnp.ndarray | torch.Tensor

Data to transform of shape (n_trials, n_features[, n_timepoints]).

*argsAny

Additional arguments to pass to the estimator.

Returns:
Wnp.ndarray | torch.Tensor

Whitened data of shape (n_trials, n_features[, n_timepoints]).

mvpy.estimators.csp module#

A collection of estimators for common spatial patterns.

class mvpy.estimators.csp.CSP(alphas: Tensor | ndarray | float | int = 1, **kwargs)[source]#

Bases: BaseEstimator

Implements a simple linear ridge decoder.

Parameters:
alphasUnion[torch.Tensor, np.ndarray]

The penalties to use for estimation.

fit_interceptbool, default=True

Whether to fit an intercept.

normalisebool, default=True

Whether to normalise the data.

alpha_per_targetbool, default=False

Whether to use a different penalty for each target.

Attributes:
estimator_mvpy.estimators.RidgeCV

The ridge estimator.

pattern_Union[torch.Tensor, np.ndarray]

The decoded pattern.

coef_Union[torch.Tensor, np.ndarray]

The coefficients of the decoder.

intercept_Union[torch.Tensor, np.ndarray]

The intercepts of the decoder.

alpha_Union[torch.Tensor, np.ndarray]

The penalties used for estimation.

Notes

After fitting the decoder, this class will also estimate the decoded patterns. This follows the approach detailed in [4]. Please also be aware that, while this class supports decoding multiple features at once, these will principally be separate regressions wherein individual contributions are not disentangled. If you would like to do this, please consider using a back-to-back decoder.

References

[4]

Haufe, S., Meinecke, F., Görgen, K., Dähne, S., Haynes, J.D., Blankertz, B., & Bießmann, F. (2014). On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage, 87, 96-110. 10.1016/j.neuroimage.2013.10.067

Examples

>>> import torch
>>> from mvpy.estimators import Decoder
>>> X = torch.normal(0, 1, (100, 5))
>>> ß = torch.normal(0, 1, (5, 60))
>>> y = X @ ß + torch.normal(0, 1, (100, 60))
>>> decoder = Decoder(alphas = torch.logspace(-5, 10, 20)).fit(y, X)
>>> decoder.pattern_.shape
torch.Size([60, 5])
>>> decoder.predict(y).shape
torch.Size([100, 5])
clone()[source]#

Clone this class.

Returns:
Decoder

The cloned object.

fit(X: ndarray | Tensor, y: ndarray | Tensor)[source]#

Fit the estimator.

Parameters:
XUnion[np.ndarray, torch.Tensor]

The features.

yUnion[np.ndarray, torch.Tensor]

The targets.

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
XUnion[np.ndarray, torch.Tensor]

The features.

Returns:
Union[np.ndarray, torch.Tensor]

The predictions.

mvpy.estimators.csp.loss(C)[source]#

Loss to be minimised by Pham's method.

mvpy.estimators.csp.mean_rotation(C)[source]#
mvpy.estimators.csp.rotmat(C, i, j)[source]#

Compute the update matrix according to Pham's method. See: D. T. Pham, "Joint Approximate Diagonalization of Positive Definite Hermitian Matrices," SIAM Journal on Matrix Analysis and Applications, vol. 22, no. 4, pp. 1136-1152, Jan. 2001.

mvpy.estimators.kernelridgecv module#

A collection of estimators for fitting cross-validated ridge regressions.

class mvpy.estimators.kernelridgecv.KernelRidgeCV(alphas: ndarray | Tensor | list | float | int = 1, kernel: str = 'linear', gamma: float | str = 'auto', coef0: float = 1.0, degree: float = 3.0, alpha_per_target: bool = False)[source]#

Bases: BaseEstimator

Implements a kernel ridge regression with cross-validation.

Kernel ridge regression maps input data \(X\) to output data \(y\) through coefficients \(\beta\):

\[y = \kappa\beta + \varepsilon\]

where \(\kappa\) is some Gram matrix of \(X\). We then solve for the model \(\beta\) through:

\[\arg\min_\beta \frac{1}{2}\lvert\lvert y - \kappa\beta\rvert\rvert_2^2 + \frac{\alpha_\beta}{2}\lvert\lvert\beta\rvert\rvert_\kappa^2\]

where \(\alpha_\beta\) are penalties to test in LOO-CV which has a convenient closed-form solution here:

\[\arg\min_{\alpha_\beta} \frac{1}{N} \sum_{i = 1}^{N} \left(\frac{y_i - \left(\kappa\beta_\alpha\right)_i}{1 - H_{\alpha,ii}}\right)^2 \qquad\textrm{where}\qquad H_{\alpha,ii} = \textrm{diag}\left(\kappa\left(\kappa + \alpha_\beta I\right)^{-1}\right)\]

In other words, this solves a ridge regression in the parameter space defined by the kernel function \(\kappa(X, X)\). This is convenient because, just like SVC, it allows for non-parametric estimation. For example, kernel rbf may capture non-linearities in data that RidgeCV cannot account for. The closed-form LOO-CV formula is evaluated at all values of alphas and the penalty minimising the mean-squared loss is automatically chosen. This is convenient because it is faster than performing inner cross-validation to fine-tune penalties.

As such, KernelRidgeCV mirrors SVC in its application of the kernel trick and the associated benefits. The key difference here is that KernelRidgeCV is fit using L2-regularised squared error, whereas SVC is fit through sequential minimal optimisation or gradient ascent over hinge losses. In practice, this means that KernelRidgeCV is much faster, particularly when multiple values of alphas are specified, but produces less sparse solutions that are not margin-based.

For more information on kernel ridge regression, see [1] [2].
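
As a sketch of the dual form: fit on a Gram matrix, then predict via kernels between new and training samples (a hypothetical rbf_kernel helper with a fixed gamma and a single penalty; the class handles kernel choice, LOO-CV and per-target penalties):

import torch

def rbf_kernel(A: torch.Tensor, B: torch.Tensor, gamma: float) -> torch.Tensor:
    # k(a, b) = exp(-gamma * ||a - b||^2)
    return torch.exp(-gamma * torch.cdist(A, B) ** 2)

X = torch.normal(0, 1, (240, 5))
y = torch.normal(0, 1, (240, 3))
gamma, alpha = 1.0 / X.shape[1], 1.0

K = rbf_kernel(X, X, gamma)
A_dual = torch.linalg.solve(K + alpha * torch.eye(K.shape[0]), y)  # dual coefficients

X_new = torch.normal(0, 1, (10, 5))
y_h = rbf_kernel(X_new, X, gamma) @ A_dual  # predictions of shape (10, 3)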

Parameters:
alphasnp.ndarray | torch.tensor | List | float | int, default=1.0

Alpha penalties to test.

kernel{‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’}, default=’linear’

Kernel function to use.

gamma{float, ‘auto’, ‘scale’}, default=’auto’

Gamma to use in kernel computation.

coef0float, default=1.0

Coefficient zero to use in kernel computation.

degreefloat, default=3.0

Degree of kernel to use.

alpha_per_targetbool, default=False

Should we fit one alpha per target?

Attributes:
alphasnp.ndarray | torch.tensor | List | float | int, default=1.0

Alpha penalties to test.

kernel{‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’}, default=’linear’

Kernel function to use.

gamma{float, ‘auto’, ‘scale’}, default=’auto’

Gamma to use in kernel computation.

coef0float, default=1.0

Coefficient zero to use in kernel computation.

degreefloat, default=3.0

Degree of kernel to use.

alpha_per_targetbool, default=False

Should we fit one alpha per target?

X_np.ndarray | torch.Tensor

Training data X of shape (n_samples, n_channels).

A_dual_np.ndarray | torch.Tensor

The fitted dual coefficients of shape (n_samples, n_features).

alpha_float | np.ndarray | torch.Tensor

Chosen alpha penalties.

coef_Optional[np.ndarray | torch.Tensor]

If kernel is linear, coefficients of shape (n_channels, n_features).

metric_mvpy.metrics.r2

The default metric to use.

See also

mvpy.estimators.RidgeCV

Alternative ridge regression without kernel functions.

mvpy.math.kernel_linear, mvpy.math.kernel_poly, mvpy.math.kernel_rbf, mvpy.math.kernel_sigmoid

Available kernel functions.

Notes

Coefficients coef_ are available only when kernel is linear where primal weights can be computed from dual solutions:

\[w = X^T\beta\]

For other kernel functions, coefficients are not interpretable and, therefore, not computed here.

Warning

For small values of alphas, kernel matrices may no longer be positive semidefinite. This means that, in many cases, model fitting may have to resort to least-squares solutions, which can decrease throughput by an order of magnitude (or more). This issue is particularly prevalent in the numpy backend. Please consider this when choosing penalties.

Warning

This issue can also appear independently of alphas. For example, the Gram matrix given \(X\sim\mathcal{N}(0, 1)\) will already be rank-deficient if \(n\_samples > n\_channels\). As is the case in sklearn, this will lead to poor solving speed in the numpy backend. The torch backend is more robust to this. Please consider this when investigating your data prior to model fitting.

References

[1]

Murphy, K.P. (2012). Machine learning: A probabilistic perspective. MIT Press.

[2]

Nadaraya, E.A. (1964). On estimating regression. Theory of Probability and Its Applications, 9, 141-142. 10.1137/1109020

Examples

>>> import torch
>>> from mvpy.estimators import KernelRidgeCV
>>> ß = torch.normal(0, 1, size = (5,))
>>> X = torch.normal(0, 1, size = (240, 5))
>>> y = ß @ X.T + torch.normal(0, 0.5, size = (X.shape[0],))
>>> model = KernelRidgeCV().fit(X, y)
>>> model.coef_
clone() KernelRidgeCV[source]#

Make a clone of this class.

Returns:
estimatorKernelRidgeCV

A clone of this class.

fit(X: ndarray | Tensor, y: ndarray | Tensor) KernelRidgeCV[source]#

Fit the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels).

ynp.ndarray | torch.Tensor

Input features of shape (n_samples, n_features).

Returns:
estimatorKernelRidgeCV

The fitted estimator.

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Make predictions from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels).

Returns:
y_hnp.ndarray | torch.Tensor

Predicted output features of shape (n_samples, n_features).

score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#

Make predictions from \(X\) and score against \(y\).

Parameters:
Xtorch.Tensor

Input data of shape (n_samples, n_channels).

ytorch.Tensor

Output data of shape (n_samples, n_features).

metricOptional[Metric | Tuple[Metric]], default=None

Metric or tuple of metrics to compute. If None, defaults to metric_.

Returns:
scorenp.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]

Scores of shape (n_features,) or, for multiple metrics, a dictionary of metric names and scores of shape (n_features,).

Warning

If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.

mvpy.estimators.receptivefield module#

A collection of estimators for ReceptiveField modeling (mTRF + SR).

class mvpy.estimators.receptivefield.ReceptiveField(t_min: float, t_max: float, fs: int, alpha: int | float | ndarray | Tensor | List = 1.0, reg_type: str | List = 'ridge', reg_cv: Any = 5, patterns: bool = False, fit_intercept: bool = True, edge_correction: bool = True)[source]#

Bases: BaseEstimator

Implements receptive field estimation (for multivariate temporal response functions or stimulus reconstruction).

Generally, mTRF models are described by:

\[r(t,n) = \sum_{\tau} w(\tau, n) s(t - \tau) + \varepsilon\]

where \(r(t,n)\) is the reconstructed signal at timepoint \(t\) for channel \(n\), \(s(t)\) is the stimulus at time \(t\), \(w(\tau, n)\) is the weight at time delay \(\tau\) for channel \(n\), and \(\varepsilon\) is the error.

SR models are estimated as:

\[s(t) = \sum_{n}\sum_{\tau} r(t + \tau, n) g(\tau, n)\]

where \(s(t)\) is the reconstructed stimulus at time \(t\), \(r(t,n)\) is the neural response at \(t\) and lagged by \(\tau\) for channel \(n\), \(g(\tau, n)\) is the weight at time delay \(\tau\) for channel \(n\).

For more information on mTRF or SR models, see [1].
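
To make the forward model concrete, the mTRF prediction for a single feature and channel is just a convolution of the stimulus with the weights over lags. A minimal sketch (using a symmetric kernel, for which convolution and cross-correlation coincide):

import torch

w = torch.tensor([1., 2., 3., 2., 1.])  # w(tau) over five lags
s = torch.normal(0, 1, (1, 1, 200))     # stimulus s(t)
# r(t) = sum_tau w(tau) s(t - tau); conv1d computes a cross-correlation,
# which equals the convolution here because w is symmetric.
r = torch.nn.functional.conv1d(s, w[None, None, :], padding='same')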

Consequently, this class fundamentally solves the same problem as TimeDelayed. However, unlike TimeDelayed, this approach avoids creating and solving the full time-delayed design and outcome matrix. Instead, it uses the fact that we are fundamentally interested in (de-)convolution, which can be solved efficiently through estimation of auto- and cross-correlations in the Fourier domain. For more information on this approach, see [2] [3] [4].

Solving this in the Fourier domain can be extremely beneficial when the number of predictors n_features is small, but it scales poorly as n_features grows unless edge_correction is explicitly disabled. Generally, we would recommend testing both ReceptiveField and TimeDelayed on a realistic subset of the data before deciding between the two approaches.

Like TimeDelayed, this class will automatically perform inner cross-validation if multiple values of alpha are supplied.

Parameters:
t_minfloat

Minimum time point to fit (unlike TimeDelayed, this is relative to y).

t_maxfloat

Maximum time point to fit (unlike TimeDelayed, this is relative to y). Must be greater than t_min.

fsint

Sampling frequency.

alphaint | float | np.ndarray | torch.Tensor | List, default=1.0

Alpha penalties as float or of shape (n_penalties,). If not float, cross-validation will be employed (see reg_cv).

reg_type{‘ridge’, ‘laplacian’, List}, default=’ridge’

Type of regularisation to employ (either ‘ridge’ or ‘laplacian’ or tuple describing (time, features)).

reg_cv{int, ‘LOO’, mvpy.crossvalidation.KFold}, default=5

If alpha is list or array, what cross-validation scheme should we use? Integers are interpreted as n_splits for KFold. String input 'LOO' will use RidgeCV to solve LOO-CV over alphas, but is available only for reg_type 'ridge'. Alternatively, a cross-validator that exposes a split() method can be supplied.

patternsbool, default=False

Should we estimate the patterns from coefficients and data (useful only for stimulus reconstruction, not mTRF)?

fit_interceptbool, default=True

Should we fit an intercept for this model?

edge_correctionbool, default=True

Should we apply edge corrections to auto-correlations?

Attributes:
t_minfloat

Minimum time point to fit (unlike TimeDelayed, this is relative to y).

t_maxfloat

Maximum time point to fit (unlike TimeDelayed, this is relative to y). Must be greater than t_min.

fsint

Sampling frequency.

alphaint | float | np.ndarray | torch.Tensor | List, default=1.0

Alpha penalties as float or of shape (n_penalties,). If not float, cross-validation will be employed (see reg_cv).

reg_type{‘ridge’, ‘laplacian’, List}, default=’ridge’

Type of regularisation to employ (either ‘ridge’ or ‘laplacian’ or tuple describing (time, features)).

reg_cv{int, ‘LOO’, mvpy.crossvalidation.KFold}, default=5

If alpha is list or array, what cross-validation scheme should we use? Integers are interpreted as n_splits for KFold. String input 'LOO' will use RidgeCV to solve LOO-CV over alphas, but is available only for reg_type 'ridge'. Alternatively, a cross-validator that exposes a split() method can be supplied.

patternsbool, default=False

Should we estimate the patterns from coefficients and data (useful only for stimulus reconstruction, not mTRF)?

fit_interceptbool, default=True

Should we fit an intercept for this model?

edge_correctionbool, default=True

Should we apply edge corrections to auto-correlations?

s_minint

t_min converted to samples.

s_maxint

t_max converted to samples.

windownp.ndarray | torch.Tensor

The TRF window ranging from s_min to s_max, of shape (n_trf,).

n_features_int

Number of features in \(X\).

n_channels_int

Number of channels in \(y\).

n_trf_int

Number of timepoints in the estimated response functions.

cov_np.ndarray | torch.Tensor

Covariance from auto-correlations of shape (n_samples, n_features * n_trf, n_features * n_trf).

coef_np.ndarray | torch.Tensor

Estimated coefficients of shape (n_channels, n_features, n_trf).

pattern_np.ndarray | torch.Tensor

If computed, estimated pattern of shape (n_channels, n_features, n_trf).

intercept_float | np.ndarray | torch.Tensor

Estimated intercepts of shape (n_channels,) or float.

metric_mvpy.metrics.r2

The default metric to use.

See also

mvpy.estimators.TimeDelayed

An alternative mTRF/SR estimator that solves the time-expanded design matrix.

mvpy.crossvalidation.KFold, mvpy.crossvalidation.RepeatedKFold, mvpy.crossvalidation.StratifiedKFold, mvpy.crossvalidation.RepeatedStratifiedKFold

Cross-validation classes for automatically testing multiple values of alpha.

Notes

For SR models it is recommended to also set patterns to True to estimate not only the coefficients but also the patterns that were actually used for reconstructing stimuli. For more information, see [5].

Warning

Unlike TimeDelayed, this class expects t_min and t_max to be causal in \(y\). Consequently, positive values mean \(X(t)\) exerts influence on \(y(t + \tau)\). This is in line with MNE's behaviour.

References

[1]

Crosse, M.J., Di Liberto, G.M., Bednar, A., & Lalor, E.C. (2016). The multivariate temporal response function (mTRF) toolbox: A MATLAB toolbox for relating neural signals to continuous stimuli. Frontiers in Human Neuroscience, 10, 604. 10.3389/fnhum.2016.00604

[2]

Willmore, B., & Smyth, D. (2009). Methods for first-order kernel estimation: Simple-cell receptive fields from responses to natural scenes. Network: Computation in Neural Systems, 14, 553-577. 10.1088/0954-898X_14_3_309

[3]

Theunissen, F.E., David, S.V., Singh, N.C., Hsu, A., Vinje, W.E., & Gallant, J.L. (2001). Estimating spatio-temporal receptive fields of auditory and visual neurons from their responses to natural stimuli. Network: Computation in Neural Systems, 12, 289-316. 10.1080/net.12.3.289.316

[5]

Haufe, S., Meinecke, F., Görgen, K., Dähne, S., Haynes, J.D., Blankertz, B., & Bießmann, F. (2014). On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage, 87, 96-110. 10.1016/j.neuroimage.2013.10.067

Examples

For mTRF estimation, we can do:

>>> import torch
>>> from mvpy.estimators import ReceptiveField
>>> ß = torch.tensor([1., 2., 3., 2., 1.])
>>> X = torch.normal(0, 1, (100, 1, 50))
>>> y = torch.nn.functional.conv1d(X, ß[None,None,:], padding = 'same')
>>> y = y + torch.normal(0, 1, y.shape)
>>> trf = ReceptiveField(-2, 2, 1, alpha = 1e-5)
>>> trf.fit(X, y).coef_
tensor([[[0.9912, 2.0055, 2.9974, 1.9930, 0.9842]]])

For stimulus reconstruction, we can do:

>>> import torch
>>> from mvpy.estimators import ReceptiveField
>>> ß = torch.tensor([1., 2., 3., 2., 1.])
>>> X = torch.arange(50)[None,None,:] * torch.ones((100, 1, 50))
>>> y = torch.nn.functional.conv1d(X, ß[None,None,:], padding = 'same')
>>> y = y + torch.normal(0, 1, y.shape)
>>> X, y = y, X
>>> sr = ReceptiveField(-2, 2, 1, alpha = 1e-3, patterns = True).fit(X, y)
>>> sr.predict(X).mean(0)[0,:]
tensor([ 0.2148,  0.7017,  1.4021,  2.3925,  3.5046,  4.4022,  5.4741,  6.4759,
         7.5530,  8.4915,  9.6014,  10.5186, 11.5872, 12.6197, 13.5862, 14.6769,
         15.6523, 16.6765, 17.6622, 18.7172, 19.7117, 20.7994, 21.7023, 22.7885,
         23.8434, 24.7849, 25.8697, 26.8705, 27.8523, 28.9028, 29.9428, 30.9342,
         31.9401, 32.9729, 33.9704, 34.9847, 36.0325, 37.0251, 38.0297, 39.0678,
         40.0847, 41.0827, 42.1410, 43.0924, 44.2115, 45.1548, 41.9511, 45.9482,
         32.2861, 76.4690])
clone() ReceptiveField[source]#

Clone this class.

Returns:
rfReceptiveField

The cloned object.

fit(X: ndarray | Tensor, y: ndarray | Tensor) ReceptiveField[source]#

Fit the estimator, optionally with cross-validation over penalties.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_features, n_timepoints).

ynp.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels, n_timepoints).

Returns:
rfmvpy.estimators._ReceptiveField_numpy | mvpy.estimators._ReceptiveField_torch

The fitted ReceptiveField estimator.

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Make predictions from model.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_features, n_timepoints).

Returns:
y_hnp.ndarray | torch.Tensor

Predicted responses of shape (n_samples, n_channels, n_timepoints).

score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#

Make predictions from \(X\) and score against \(y\).

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_features, n_timepoints).

ynp.ndarray | torch.Tensor

Output data of shape (n_samples, n_channels, n_timepoints).

metricOptional[Metric | Tuple[Metric]], default=None

Metric or tuple of metrics to compute. If None, defaults to metric_.

Returns:
scorenp.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]

Scores of shape (n_channels, n_timepoints) or, for multiple metrics, a dictionary of metric names and scores of shape (n_channels, n_timepoints).

Warning

If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.

mvpy.estimators.ridgeclassifier module#

A collection of estimators for ridge classification.

class mvpy.estimators.ridgeclassifier.RidgeClassifier(alphas: Tensor | ndarray | float | int = 1, method: str = 'OvR', fit_intercept: bool = True, normalise: bool = True, alpha_per_target: bool = False)[source]#

Bases: BaseEstimator

Implements a linear ridge classifier.

Ridge classifiers effectively frame a classification problem as a simple linear ridge regression, mapping from neural data \(X\) to labels \(y\) through spatial filters \(\beta\):

\[y = \beta X + \varepsilon\quad\textrm{where}\quad y\in\{-1, 1\}\]

Consequently, we solve for spatial filters through:

\[\arg\min_{\beta} \sum_{i}(y_i - \beta^TX_i)^2 + \alpha_\beta\lvert\lvert\beta\rvert\rvert^2\]

where \(\alpha_\beta\) are the penalties to test in LOO-CV.

This linear filter estimation is extremely convenient for neural decoding because, unlike other decoding approaches such as SVC, this can be solved extremely efficiently and, for many decoding tasks, will perform well.
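
A minimal sketch of the idea in the binary case (single shared penalty; the class itself additionally handles normalisation, LOO-CV over alphas and multi-class schemes):

import torch

X = torch.normal(0, 1, (200, 8))  # neural data
w = torch.normal(0, 1, (8,))
y = torch.sign(X @ w)             # labels in {-1, 1}

alpha = 1.0
beta = torch.linalg.solve(X.T @ X + alpha * torch.eye(8), X.T @ y)
y_h = torch.sign(X @ beta)        # sign of the decision value
accuracy = (y_h == y).float().mean()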

Parameters:
alphanp.ndarray | torch.Tensor

The penalties to use for estimation.

fit_interceptbool, default=True

Whether to fit an intercept.

normalisebool, default=True

Whether to normalise the data.

alpha_per_targetbool, default=False

Whether to fit individual alphas per target.

Attributes:
alphanp.ndarray | torch.Tensor

The penalties to use for estimation.

fit_interceptbool, default=True

Whether to fit an intercept.

normalisebool, default=True

Whether to normalise the data.

alpha_per_targetbool, default=False

Whether to fit individual alphas per target.

estimatormvpy.estimators.RidgeDecoder

The ridge estimator.

binariser_mvpy.preprocessing.LabelBinariser

The label binariser used internally.

intercept_np.ndarray | torch.Tensor

The intercepts of the classifier.

coef_np.ndarray | torch.Tensor

The coefficients of the classifier.

pattern_np.ndarray | torch.Tensor

The patterns of the classifier.

metric_mvpy.metrics.Accuracy

The default metric to use.

Notes

By default, this will not allow alpha values to differ between targets. In certain situations, however, this may be desirable. In the multi-class case, it should be carefully evaluated whether alpha_per_target should be enabled, as it may hurt decoding performance if penalties land on different scales and method is OvR.

Coefficients are transformed to patterns to facilitate interpretation thereof. For more information, please see [1].

References

[1]

Haufe, S., Meinecke, F., Görgen, K., Dähne, S., Haynes, J.D., Blankertz, B., & Bießmann, F. (2014). On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage, 87, 96-110. 10.1016/j.neuroimage.2013.10.067

Examples

We can either do classification over a single feature, like so:

>>> import torch
>>> import mvpy as mv
>>> from mvpy.estimators import RidgeClassifier
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y = True)
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> clf = RidgeClassifier(torch.logspace(-5, 10, 20)).fit(X, y)
>>> y_h = clf.predict(X)
>>> mv.math.accuracy(y_h.squeeze(), y)
tensor(0.8533)

Or we can also do classification over multiple features, like so:

>>> import numpy as np
>>> import torch
>>> import mvpy as mv
>>> from mvpy.estimators import RidgeClassifier
>>> from sklearn.datasets import make_classification
>>> X0, y0 = make_classification(n_classes = 3, n_informative = 6)
>>> X1, y1 = make_classification(n_classes = 4, n_informative = 8)
>>> X = torch.from_numpy(np.concatenate((X0, X1), axis = -1)).float()
>>> y = torch.from_numpy(np.stack((y0, y1), axis = -1)).float()
>>> clf = RidgeClassifier(torch.logspace(-5, 10, 20)).fit(X, y)
>>> y_h = clf.predict(X)
>>> mv.math.accuracy(y_h.T, y.T)
tensor([0.8200, 0.7500])
clone() RidgeClassifier[source]#

Clone this class.

Returns:
clfRidgeClassifier

The cloned object.

copy() RidgeClassifier[source]#

Clone this class.

Returns:
clfRidgeClassifier

The cloned object.

decision_function(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features (n_samples, n_channels).

Returns:
dfnp.ndarray | torch.Tensor

The predictions of shape (n_samples, n_classes).

fit(X: ndarray | Tensor, y: ndarray | Tensor) BaseEstimator[source]#

Fit the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features of shape (n_samples, n_channels).

ynp.ndarray | torch.Tensor

The targets of shape (n_samples[, n_features]).

Returns:
clfClassifier

The classifier.

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features (n_samples, n_channels).

Returns:
y_hnp.ndarray | torch.Tensor

The predictions of shape (n_samples, n_features).

predict_proba(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features (n_samples, n_channels).

Returns:
pnp.ndarray | torch.Tensor

The predictions of shape (n_samples, n_classes).

Warning

Probabilities are computed from expit() over outputs of decision_function(). Consequently, probability estimates returned by this class are not calibrated. See Classifier for more information.

score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#

Make predictions from \(X\) and score against \(y\).

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels).

ynp.ndarray | torch.Tensor

Output data of shape (n_samples, n_features).

metricOptional[Metric | Tuple[Metric]], default=None

Metric or tuple of metrics to compute. If None, defaults to metric_.

Returns:
scorenp.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]

Scores of shape (n_features,) or, for multiple metrics, a dictionary of metric names and scores of shape (n_features,).

Warning

If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.

mvpy.estimators.ridgecv module#

A collection of estimators for fitting cross-validated ridge regressions.

class mvpy.estimators.ridgecv.RidgeCV(alphas: ndarray | Tensor | list | float | int = 1, fit_intercept: bool = True, normalise: bool = True, alpha_per_target: bool = False)[source]#

Bases: BaseEstimator

Implements ridge regression with cross-validation.

Ridge regression maps input data \(X\) to output data \(y\) through coefficients \(\beta\):

\[y = \beta X + \varepsilon\]

and solves for the model \(\beta\) through:

\[\arg\min_\beta \sum_i (y_i - \beta^T X_i)^2 + \alpha_\beta\lvert\lvert\beta\rvert\rvert^2\]

where \(\alpha_\beta\) are penalties to test in LOO-CV which has a convenient closed-form solution here:

\[\arg\min_{\alpha_\beta} \frac{1}{N}\sum_{i = 1}^{N} \left(\frac{y_i - \left(X\beta_\alpha\right)_i}{1 - H_{\alpha,ii}}\right)^2\qquad \textrm{where}\qquad H_{\alpha,ii} = \textrm{diag}\left(X(X^T X + \alpha I)^{-1}X^T\right)\]

As such, this will automatically evaluate the LOO-CV error for all values of alphas and choose the penalty that minimises the mean-squared loss. This is convenient because it is much faster than performing inner cross-validation to fine-tune penalties.

For more information on ridge regression, see [1]. This implementation follows [2].
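
A minimal sketch of this closed-form LOO search (a hypothetical helper using dense solves and a single shared penalty; the class is considerably more efficient):

import torch

def loo_alpha(X: torch.Tensor, y: torch.Tensor, alphas: torch.Tensor) -> float:
    # X: (n_samples, n_channels); y: (n_samples, n_features).
    best, best_loss = None, float('inf')
    eye = torch.eye(X.shape[1])
    for alpha in alphas:
        inv = torch.linalg.inv(X.T @ X + alpha * eye)
        h = torch.diagonal(X @ inv @ X.T)                 # leverage H_ii
        resid = (y - X @ (inv @ (X.T @ y))) / (1 - h)[:, None]
        loss = (resid ** 2).mean().item()                 # mean squared LOO error
        if loss < best_loss:
            best, best_loss = float(alpha), loss
    return best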

Parameters:
alphasnp.ndarray | torch.Tensor | List | float | int, default=1

Penalties to use for estimation.

fit_interceptbool, default=True

Whether to fit an intercept.

normalisebool, default=True

Whether to normalise the data.

alpha_per_targetbool, default=False

Whether to use a different penalty for each target.

Attributes:
alpha_np.ndarray | torch.Tensor

The penalties used for estimation.

intercept_np.ndarray | torch.Tensor

The intercepts of shape (n_features,).

coef_np.ndarray | torch.Tensor

The coefficients of shape (n_channels, n_features).

metric_mvpy.metrics.r2

The default metric to use.

Notes

If data are supplied as numpy, this class will fall back to sklearn.linear_model.RidgeCV. See [3].

References

[1]

McDonald, G.C. (2009). Ridge regression. Wiley Interdisciplinary Reviews: Computational Statistics, 1, 93-100. 10.1002/wics.14

[2]

King, J.R. (2020). torch_ridge. kingjr/torch_ridge

[3]

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., … & Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825-2830.

Examples

>>> import torch
>>> from mvpy.estimators import RidgeCV
>>> ß = torch.normal(0, 1, size = (5,))
>>> X = torch.normal(0, 1, size = (240, 5))
>>> y = ß @ X.T + torch.normal(0, 0.5, size = (X.shape[0],))
>>> model = RidgeCV().fit(X, y)
>>> model.coef_
clone() RidgeCV[source]#

Make a clone of this class.

Returns:
ridgeRidgeCV

A clone of this class.

fit(X: ndarray | Tensor, y: ndarray | Tensor) RidgeCV[source]#

Fit the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels).

ynp.ndarray | torch.Tensor

Output data of shape (n_samples, n_features).

Returns:
ridgeRidgeCV

The fitted ridge estimator.

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels).

Returns:
y_hnp.ndarray | torch.Tensor

Predicted data of shape (n_samples, n_features).

score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#

Make predictions from \(X\) and score against \(y\).

Parameters:
Xtorch.Tensor

Input data of shape (n_samples, n_channels).

ytorch.Tensor

Output data of shape (n_samples, n_features).

metricOptional[Metric], default=None

Metric or tuple of metrics to compute. If None, defaults to metric_.

Returns:
scorenp.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]

Scores of shape (n_features,) or, for multiple metrics, a dictionary of metric names and scores of shape (n_features,).

Warning

If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.

mvpy.estimators.ridgedecoder module#

A collection of estimators for decoding features using ridge decoders.

class mvpy.estimators.ridgedecoder.RidgeDecoder(alphas: Tensor | ndarray | float | int = 1, **kwargs)[source]#

Bases: BaseEstimator

Implements a linear ridge decoder.

This decoder maps from neural data \(X\) to features \(y\) through spatial filters \(\beta\):

\[y = \beta X + \varepsilon\]

Consequently, we solve for spatial filters through:

\[\arg\min_{\beta} \sum_{i} (y_i - \beta^T X_i)^2 + \alpha_\beta \lvert\lvert\beta\rvert\rvert^2\]

where \(\alpha_\beta\) are the penalties to test in LOO-CV.

Beyond what RidgeCV would also achieve, this class additionally computes the patterns used for decoding following [1].
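
A minimal sketch of that pattern transform (following \(A = \Sigma_X W \Sigma_{\hat{s}}^{-1}\), with \(W\) the decoding filters; the helper name and exact normalisation are illustrative):

import torch

def filters_to_patterns(X: torch.Tensor, W: torch.Tensor) -> torch.Tensor:
    # X: (n_trials, n_channels); W: (n_channels, n_features) filters.
    X_c = X - X.mean(dim=0, keepdim=True)
    s_h = X_c @ W                                # decoded sources
    cov_X = X_c.T @ X_c / (X.shape[0] - 1)
    cov_s = s_h.T @ s_h / (X.shape[0] - 1)
    return cov_X @ W @ torch.linalg.pinv(cov_s)  # patterns (n_channels, n_features)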

Parameters:
alphasnp.ndarray | torch.Tensor

The penalties to use for estimation.

fit_interceptbool, default=True

Whether to fit an intercept.

normalisebool, default=True

Whether to normalise the data.

alpha_per_targetbool, default=False

Whether to use a different penalty for each target.

Attributes:
estimator_mvpy.estimators.RidgeCV

The ridge estimator.

pattern_np.ndarray | torch.Tensor

The decoded pattern of shape (n_channels, n_features).

coef_np.ndarray | torch.Tensor

The coefficients of the decoder of shape (n_features, n_channels).

intercept_np.ndarray | torch.Tensor

The intercepts of the decoder of shape (n_features,).

alpha_np.ndarray | torch.Tensor

The penalties used for estimation.

metric_mvpy.metrics.r2

The default metric to use.

See also

mvpy.estimators.RidgeCV

The estimator used for ridge decoding.

mvpy.estimators.B2B

An alternative decoding estimator that explicitly disentangles correlated features.

Notes

While this class supports decoding an arbitrary number of features at once, all features will be treated as individual regressions. Consequently, this class cannot control for correlations among predictors. If this is desired, refer to B2B instead.

References

[1]

Haufe, S., Meinecke, F., Görgen, K., Dähne, S., Haynes, J.D., Blankertz, B., & Bießmann, F. (2014). On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage, 87, 96-110. 10.1016/j.neuroimage.2013.10.067

Examples

>>> import torch
>>> from mvpy.estimators import RidgeDecoder
>>> X = torch.normal(0, 1, (100, 5))
>>> ß = torch.normal(0, 1, (5, 60))
>>> y = X @ ß + torch.normal(0, 1, (100, 60))
>>> decoder = RidgeDecoder(alphas = torch.logspace(-5, 10, 20)).fit(y, X)
>>> decoder.pattern_.shape
torch.Size([60, 5])
>>> decoder.predict(y).shape
torch.Size([100, 5])
clone() RidgeDecoder[source]#

Clone this class.

Returns:
decodermvpy.estimators.RidgeDecoder

The cloned object.

fit(X: ndarray | Tensor, y: ndarray | Tensor)[source]#

Fit the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The neural data of shape (n_trials, n_channels).

ynp.ndarray | torch.Tensor

The features of shape (n_trials, n_features).

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The neural data of shape (n_trials, n_channels).

Returns:
y_hnp.ndarray | torch.Tensor

The predictions of shape (n_trials, n_features).

score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#

Make predictions from \(X\) and score against \(y\).

Parameters:
Xtorch.Tensor

Input data of shape (n_samples, n_channels).

ytorch.Tensor

Output data of shape (n_samples, n_features).

metricOptional[Metric], default=None

Metric or tuple of metrics to compute. If None, defaults to metric_.

Returns:
scorenp.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]

Scores of shape (n_features,) or, for multiple metrics, a dictionary of metric names and scores of shape (n_features,).

Warning

If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.

mvpy.estimators.ridgeencoder module#

A collection of estimators for encoding features using ridge regressions.

class mvpy.estimators.ridgeencoder.RidgeEncoder(alphas: Tensor | ndarray | float | int = 1, **kwargs)[source]#

Bases: BaseEstimator

Implements a linear ridge encoder.

This encoder maps features \(X\) to neural data \(y\) through the forward model \(\beta\):

\[y = \beta X + \varepsilon\]

Consequently, we solve for the forward model through:

\[\arg\min_{\beta} \sum_i(y_i - \beta^T X_i)^2 + \alpha_\beta \lvert\lvert\beta\rvert\rvert^2\]

where \(\alpha_\beta\) are the penalties to test in LOO-CV.

Unlike a standard RidgeCV, this class also supports solving for the full encoding model (including all time points) at once, using a single alpha. This may be useful when trying to avoid different alphas at different time steps, as would be the case when using Sliding to slide over the temporal dimension when encoding.

Parameters:
alphasnp.ndarray | torch.Tensor | float | int, default=1

The penalties to use for estimation.

kwargsAny

Additional arguments.

Attributes:
alphasnp.ndarray | torch.Tensor

The penalties to use for estimation.

kwargsAny

Additional arguments for the estimator.

estimatormvpy.estimators.RidgeCV

The estimator to use.

intercept_np.ndarray | torch.Tensor

The intercepts of the encoder of shape (1, n_channels).

coef_np.ndarray | torch.Tensor

The coefficients of the encoder of shape (n_features, n_channels[, n_timepoints]).

metric_mvpy.metrics.r2

The default metric to use.

See also

mvpy.estimators.RidgeCV

The estimator used for encoding.

mvpy.estimators.TimeDelayed, mvpy.estimators.ReceptiveField

Alternative estimators for explicitly modeling temporal response functions.

Notes

This assumes a one-to-one mapping in feature and neural time. This is, of course, wrong in principle, but may be good enough when we have a simple set of features and want to find out at which points in time they might correspond to neural data, for example when regressing semantic embeddings on neural data. For more explicit modeling of temporal response functions, see TimeDelayed or ReceptiveField.

Examples

Let’s say we want to do a very simple encoding:

>>> import torch
>>> from mvpy.estimators import RidgeEncoder
>>> ß = torch.normal(0, 1, (50,))
>>> X = torch.normal(0, 1, (100, 50))
>>> y = X @ ß
>>> y = y[:,None] + torch.normal(0, 1, (100, 1))
>>> encoder = RidgeEncoder().fit(X, y)
>>> encoder.coef_.shape
torch.Size([1, 50])

Next, let’s assume we want to do a temporally expanded encoding instead:

>>> import torch
>>> from mvpy.estimators import RidgeEncoder
>>> X = torch.normal(0, 1, (240, 5, 100))
>>> ß = torch.normal(0, 1, (60, 5, 100))
>>> y = torch.stack([torch.stack([X[:,:,i] @ ß[j,:,i] for i in range(X.shape[2])], 0) for j in range(ß.shape[0])], 0).swapaxes(0, 2).swapaxes(1, 2)
>>> y = y + torch.normal(0, 1, y.shape)
>>> encoder = RidgeEncoder().fit(X, y)
>>> encoder.coef_.shape
torch.Size([60, 5, 100])
clone() RidgeEncoder[source]#

Clone this class.

Returns:
encodermvpy.estimators.RidgeEncoder

The cloned object.

fit(X: ndarray | Tensor, y: ndarray | Tensor) RidgeEncoder[source]#

Fit the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features of shape (n_trials, n_features[, n_timepoints]).

ynp.ndarray | torch.Tensor

The neural data of shape (n_trials, n_channels[, n_timepoints]).

Returns:
encodermvpy.estimators.RidgeEncoder

The fitted encoder.

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features of shape (n_trials, n_features[, n_timepoints]).

Returns:
y_hnp.ndarray | torch.Tensor

The predictions of shape (n_trials, n_channels[, n_timepoints]).

score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#

Make predictions from \(X\) and score against \(y\).

Parameters:
Xtorch.Tensor

Input data of shape (n_trials, n_features[, n_timepoints]).

ytorch.Tensor

Output data of shape (n_trials, n_channels[, n_timepoints]).

metricOptional[Metric], default=None

Metric or tuple of metrics to compute. If None, defaults to metric_.

Returns:
scorenp.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]

Scores of shape (n_channels[, n_timepoints]) or, for multiple metrics, a dictionary of metric names and scores of shape (n_channels[, n_timepoints]).

Warning

If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.
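For example, assuming a second registered metric such as mvpy.metrics.pearsonr exists alongside the default mvpy.metrics.r2, a multi-metric call on the encoder fitted above might look like this (a sketch):

>>> from mvpy import metrics
>>> scores = encoder.score(X, y, metric = (metrics.r2, metrics.pearsonr))   # metrics.pearsonr assumed; returns {Metric.name: score, ...}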

mvpy.estimators.rsa module#

A collection of estimators for computing representational similarities.

class mvpy.estimators.rsa.RSA(grouped: bool = False, estimator: ~typing.Callable = <function euclidean>, n_jobs: int | None = None, verbose: bool = False)[source]#

Bases: BaseEstimator

Implements representational similarity analysis.

Representational similarity analysis computes the geometry of input data \(X\) in their feature space. For example, given input data \(X\) of shape (n_trials, n_channels, n_timepoints), it would compute representational (dis-)similarity matrices of shape (n_trials, n_trials, n_timepoints) through some (dis-)similarity function \(f\).

Generally, performing this over representations of different systems allows drawing second-order comparisons about shared properties of those systems. This is typically done by computing multiple (dis-)similarity matrices from neural and simulated data before comparing the two using, for example, spearmanr() to obtain a measure of how similar some hypothetical simulated system is to the observed neural geometry.
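As a rough sketch of such a second-order comparison (assuming mvpy.math.spearmanr correlates the two RDM stacks along their last axis; names and shapes are illustrative):

>>> import torch
>>> from mvpy.math import euclidean, spearmanr
>>> from mvpy.estimators import RSA
>>> neural = torch.normal(0, 1, (100, 10, 50))    # trials x channels x time
>>> model = torch.normal(0, 1, (100, 8, 50))      # trials x simulated features x time
>>> rdm_n = RSA(estimator = euclidean).transform(neural)
>>> rdm_m = RSA(estimator = euclidean).transform(model)
>>> r = spearmanr(rdm_n.T, rdm_m.T)               # similarity of the two geometries over time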

For more information on representational similarity analysis, please see [1] [2].

Parameters:
groupedbool, default=False

Whether to use a grouped RSA (this is required for cross-validated metrics to make sense, irrelevant otherwise).

estimatorCallable, default=mvpy.math.euclidean

The estimator/metric to use for RDM computation.

n_jobsint, default=None

Number of jobs to run in parallel (default = None).

verbosebool, default=False

Whether to print progress information.

Attributes:
rdm_np.ndarray | torch.Tensor

The upper triangle of the representational (dis-)similarity matrix of shape (n_triu_indices[, ...]).

cx_np.ndarray | torch.Tensor

The upper triangular indices of the RDM.

cy_np.ndarray | torch.Tensor

The upper triangular indices of the RDM.

groupedbool

Whether the RSA is grouped.

estimatorCallable

The estimator/metric to use for RDM computation.

n_jobsint

Number of jobs to run in parallel.

verbosebool, default=False

Whether to print progress information.

Notes

Computing (dis-)similarity across input data \(X\) may be inherently biased. For example, distance metrics like euclidean() or mahalanobis() may never truly be zero, given the noise inherent to neural responses. Consequently, cross-validation can be applied to compute unbiased estimates through mvpy.math.cv_euclidean() or mvpy.math.cv_mahalanobis(). To do this, make sure to collect many trials per condition and structure your data \(X\) as (n_trials, n_groups, n_channels, n_timepoints) while setting grouped to True.

For more information on cross-validation, please see [3].

References

[1]

Kriegeskorte, N. (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2, 4. 10.3389/neuro.06.004.2008

[2]

Diedrichsen, J., & Kriegeskorte, N. (2017). Representational models: A common framework for understanding encoding, pattern-component, and representational similarity analysis. PLOS Computational Biology, 13, e1005508. 10.1371/journal.pcbi.1005508

[3]

Diedrichsen, J., Provost, S., & Zareamoghaddam, H. (2016). On the distribution of cross-validated mahalanobis distances. arXiv. 10.48550/arXiv.1607.01371

Examples

Let’s assume we have some data with 100 trials and 5 groups, recording 10 channels over 50 time points:

>>> import torch
>>> from mvpy.math import euclidean, cv_euclidean
>>> from mvpy.estimators import RSA
>>> X = torch.normal(0, 1, (100, 5, 10, 50))
>>> rsa = RSA(estimator = euclidean)
>>> rsa.transform(X).shape
torch.Size([4950, 5, 50])

If we want to compute a cross-validated RSA over the groups instead, we can use:

>>> rsa = RSA(grouped = True, estimator = cv_euclidean)
>>> rsa.transform(X).shape
torch.Size([10, 50])

Finally, if we want to plot the full RDM, we can do:

>>> rdm = torch.zeros((5, 5, 50))
>>> rdm[rsa.cx_, rsa.cy_] = rsa.rdm_
>>> import matplotlib.pyplot as plt
>>> plt.imshow(rdm[...,0], cmap = 'RdBu_r')
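Equivalently, we can recover the same full matrix via full_rdm() (documented below):

>>> rdm = rsa.full_rdm()
>>> rdm.shape
torch.Size([5, 5, 50])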
clone()[source]#

Clone this class.

Returns:
rsamvpy.estimators.RSA

A clone of this class.

fit(X: ndarray | Tensor, *args: Any) RSA[source]#

Fit the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The data for which to compute the RDM, of shape (n_trials[, n_groups], n_channels, n_timepoints).

argsAny

Additional arguments.

Returns:
rsamvpy.estimators.RSA

Fitted RSA estimator.

fit_transform(X: ndarray | Tensor, *args: Any) ndarray | Tensor[source]#

Fit the estimator and transform data into representational similarity.

Parameters:
Xnp.ndarray | torch.Tensor

The data for which to compute the RDM, of shape (n_trials[, n_groups], n_channels, n_timepoints).

argsAny

Additional arguments.

Returns:
rdmnp.ndarray | torch.Tensor

The upper triangle of the representational (dis-)similarity matrix of shape (n_triu_indices[, ...]); use full_rdm() to obtain the full matrix of shape (n_trials, n_trials, n_timepoints) or (n_groups, n_groups, n_timepoints).

full_rdm() ndarray | Tensor[source]#

Obtain the full representational similarity matrix.

Returns:
rdmnp.ndarray | torch.Tensor

The representational similarity matrix of shape (n_trials, n_trials, n_timepoints) or (n_groups, n_groups, n_timepoints).

to_numpy()[source]#

Make this estimator use the numpy backend. Note that this method does not support conversion between types.

Returns:
rsasklearn.base.BaseEstimator

The estimator.

to_torch()[source]#

Make this estimator use the torch backend. Note that this method does not support conversion between types.

Returns:
rsasklearn.base.BaseEstimator

The estimator.

transform(X: ndarray | Tensor, *args: Any) ndarray | Tensor[source]#

Transform the data into representational similarity.

Parameters:
Xnp.ndarray | torch.Tensor

The data for which to compute the RDM, of shape (n_trials[, n_groups], n_channels, n_timepoints).

argsAny

Additional arguments.

Returns:
rdmnp.ndarray | torch.Tensor

The upper triangle of the representational (dis-)similarity matrix of shape (n_triu_indices[, ...]); use full_rdm() to obtain the full matrix of shape (n_trials, n_trials, n_timepoints) or (n_groups, n_groups, n_timepoints).

mvpy.estimators.sliding module#

A collection of estimators that allow for sliding other estimators over a dimension of the data.

class mvpy.estimators.sliding.Sliding(estimator: Callable | BaseEstimator, dims: int | tuple | list | ndarray | Tensor = -1, n_jobs: int | None = None, top: bool = True, verbose: bool = False)[source]#

Bases: BaseEstimator

Implements a sliding estimator that allows you to fit estimators iteratively over a set of dimensions.

This is particularly useful when we have a temporal dimension in our data: for example, neural data \(X\) (n_trials, n_channels, n_timepoints) and class labels \(y\) (n_trials, n_features, n_timepoints), where we want to fit a separate classifier at each time step. In this case, we can wrap our classifier object in Sliding with dims=(-1,) to automatically fit classifiers across all timepoints.

Parameters:
estimatorCallable | sklearn.base.BaseEstimator

Estimator to use. Note that this must expose a clone() method.

dimsint | Tuple[int] | List[int] | np.ndarray | torch.Tensor, default=-1

Dimensions to slide over. Note that types are inferred here, defaulting to torch. If you are fitting a numpy estimator, please specify dims as np.ndarray.

n_jobsOptional[int], default=None

Number of jobs to run in parallel.

topbool, default=True

Is this a top-level estimator? If multiple dims are specified, this will be False in recursive Sliding objects.

verbosebool, default=False

Should progress be reported verbosely?

Attributes:
estimatorCallable | sklearn.base.BaseEstimator

Estimator to use. Note that this must expose a clone() method.

dimsint | Tuple[int] | List[int] | np.ndarray | torch.Tensor, default=-1

Dimensions to slide over. Note that types are inferred here, defaulting to torch. If you are fitting a numpy estimator, please specify dims as np.ndarray.

n_jobsOptional[int], default=None

Number of jobs to run in parallel.

topbool, default=True

Is this a top-level estimator? If multiple dims are specified, this will be False in recursive Sliding objects.

verbosebool, default=False

Should progress be reported verbosely?

estimators_List[Callable | sklearn.base.BaseEstimator]

List of fitted estimators.

Notes

When fitting estimators using fit(), X and y must have the same number of dimensions. If this is not the case, please pad or expand your data appropriately (see the sketch below).
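For example, a sketch of expanding a time-invariant target to match \(X\) (shapes are hypothetical):

>>> import torch
>>> from mvpy.estimators import Sliding, RidgeDecoder
>>> X = torch.normal(0, 1, (240, 64, 100))
>>> y = torch.normal(0, 1, (240, 5))             # targets without a time dimension
>>> y = y[..., None].repeat(1, 1, X.shape[-1])   # repeat targets over time to match X
>>> sliding = Sliding(estimator = RidgeDecoder(), dims = (-1,)).fit(X, y)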

Examples

If, for example, we have \(X\) (n_trials, n_frequencies, n_channels, n_timepoints) and \(y\) (n_trials, n_frequencies, n_features, n_timepoints) and we want to slide a RidgeDecoder over (n_frequencies, n_timepoints), we can do:

>>> import torch
>>> from mvpy.estimators import Sliding, RidgeDecoder
>>> X = torch.normal(0, 1, (240, 5, 64, 100))
>>> y = torch.normal(0, 1, (240, 1, 5, 100))
>>> decoder = RidgeDecoder(
...     alphas = torch.logspace(-5, 10, 20)
... )
>>> sliding = Sliding(
...     estimator = decoder,
...     dims = (1, 3),
...     n_jobs = 4
... ).fit(X, y)
>>> patterns = sliding.collect('pattern_')
>>> patterns.shape
torch.Size([5, 100, 64, 5])
clone()[source]#

Clone this class.

Returns:
slidingmvpy.estimators.Sliding

Cloned class.

collect(attr: str) ndarray | Tensor[source]#

Collect an attribute from all estimators.

Parameters:
attrstr

Attribute to collect from all fitted estimators.

Returns:
attrnp.ndarray | torch.Tensor

Collected attribute of shape (*dims[, ...]).

fit(X: ndarray | Tensor, y: ndarray | Tensor, *args) Sliding[source]#

Fit the sliding estimators.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of arbitrary shape.

ynp.ndarray | torch.Tensor

Target data of arbitrary shape.

*args

Additional arguments to pass to estimators.

Returns:
slidingmvpy.estimators.Sliding

The fitted sliding estimator.

fit_transform(X: ndarray | Tensor, y: ndarray | Tensor, *args) ndarray | Tensor[source]#

Fit and transform the data.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of arbitrary shape.

yOptional[np.ndarray | torch.Tensor], default=None

Target data of arbitrary shape.

*argsAny

Additional arguments.

Returns:
Znp.ndarray | torch.Tensor

Transformed data of arbitrary shape.

predict(X: ndarray | Tensor, y: ndarray | Tensor | None = None, *args) ndarray | Tensor[source]#

Predict the targets.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of arbitrary shape.

yOptional[np.ndarray | torch.Tensor], default=None

Target data of arbitrary shape.

*argsAny

Additional arguments.

Returns:
y_hnp.ndarray | torch.Tensor

Predicted data of arbitrary shape.

predict_proba(X: ndarray | Tensor, y: ndarray | Tensor | None = None, *args) ndarray | Tensor[source]#

Predict the probabilities.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of arbitrary shape.

yOptional[np.ndarray | torch.Tensor], default=None

Target data of arbitrary shape.

*argsAny

Additional arguments.

Returns:
pnp.ndarray | torch.Tensor

Probabilities of arbitrary shape.

score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#

Make predictions from \(X\) and score against \(y\).

Parameters:
Xnp.ndarray | torch.Tensor

Input data of arbitrary shape.

ynp.ndarray | torch.Tensor

Output data of arbitrary shape.

metricOptional[Metric | Tuple[Metric]], default=None

Metric or tuple of metrics to compute. If None, defaults to the metric specified for the underlying estimator.

Returns:
scorenp.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]

Scores of arbitrary shape.

Warning

If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.

transform(X: ndarray | Tensor, y: ndarray | Tensor | None = None, *args) ndarray | Tensor[source]#

Transform the data.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of arbitrary shape.

yOptional[np.ndarray | torch.Tensor], default=None

Target data of arbitrary shape.

*argsAny

Additional arguments.

Returns:
Znp.ndarray | torch.Tensor

Transformed data of arbitrary shape.

mvpy.estimators.svc module#

A collection of estimators for support vector classification.

class mvpy.estimators.svc.SVC(method: str = 'OvR', C: float = 1.0, kernel: str = 'linear', gamma: str | float = 'scale', coef0: float = 0.0, degree: float = 3.0, tol: float = 0.001, lr: float = 0.001, max_iter: int = 1000)[source]#

Bases: BaseEstimator

Implements a support vector classifier.

Support vector classifiers frame a classification problem mapping from neural data \(X\) to labels \(y\in\{1, -1\}\) as a max-margin problem:

\[f(X) = w^T\varphi(X) + b\]

that separates the classes with the largest possible margin in feature space \(\varphi(\cdot)\). As in KernelRidgeClassifier, \(\varphi(X)\) is accessed implicitly through a Gram matrix defined by some kernel function. Contrary to KernelRidgeClassifier, however, SVC minimises a hinge-loss surrogate:

\[\arg\min_{w, b} \frac{1}{2}\lvert\lvert w\rvert\rvert^2 + C\sum_i\max\left(0, 1 - y_i f(X_i)\right)\]

Via the kernel trick, the decision function can be written in dual form as:

\[f(X) = \sum_{i\in\mathcal{S}} \alpha_i y_i \kappa(X_i, X) + b\]

where \(\alpha_i\ge 0\), and \(\kappa\) is a positive-definite kernel. Hyperparameters like the penalty \(C\) are typically selected by cross-validation; unlike in KernelRidgeClassifier, penalty selection cannot be conveniently automated through LOO-CV here (see the sketch below).
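Since LOO-CV is unavailable here, a simple held-out grid over \(C\) can serve instead; a minimal sketch (split sizes and candidate grid are illustrative):

>>> import torch
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification(n_samples = 200)
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> X_train, X_test, y_train, y_test = X[:150], X[150:], y[:150], y[150:]
>>> scores = {C: SVC(C = C, kernel = 'rbf').fit(X_train, y_train).score(X_test, y_test) for C in (0.1, 1.0, 10.0)}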

Compared to RidgeClassifier or KernelRidgeClassifier, SVC optimises a margin-based objective and often yields tighter decision boundaries, particularly when classes are not linearly separable or when using a non-linear kernel, at the cost of higher training time.

For more information on support vector classifiers, see [1].

Warning

SVC is currently considered experimental. As is, it uses gradient ascent over vectorised features and stops early when the change in the gradient norm \(\Delta\lvert\lvert\nabla\rvert\rvert\) falls below some tolerance. This diverges from sklearn’s behaviour and may produce slightly degraded decision boundaries. In the future, we will be switching to an SMO routine that should resolve these issues.

Parameters:
method{‘OvR’, ‘OvO’}, default=’OvR’

For multiclass problems, which method should we use? One-versus-one (OvO) or one-versus-rest (OvR)?

Cfloat, default=1.0

Regularisation strength is inversely related to C.

kernel{‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’}, default=’linear’

Which kernel function should we use (linear, poly, rbf, sigmoid)?

gamma{‘scale’, ‘auto’, float}, default=’scale’

What gamma to use for poly, rbf and sigmoid. Available methods are scale or auto, or positive float.

coef0float, default=0.0

What offset to use for poly and sigmoid.

degreefloat, default=3.0

What degree polynomial to use (if any).

tolfloat, default=1e-3

Tolerance over maximum update step (i.e., when maximal gradient < tol, early stopping is triggered).

lrfloat, default=1e-3

The learning rate.

max_iterint, default=1000

The maximum number of iterations to perform while fitting, or -1 to disable.

Attributes:
method{‘OvR’, ‘OvO’}, default=’OvR’

For multiclass problems, which method should we use? One-versus-one (OvO) or one-versus-rest (OvR)?

Cfloat, default=1.0

Regularisation strength is inversely related to C.

kernel{‘linear’, ‘poly’, ‘rbf’, ‘sigmoid’}, default=’linear’

Which kernel function should we use (linear, poly, rbf, sigmoid)?

gamma{‘scale’, ‘auto’, float}, default=’scale’

What gamma to use for poly, rbf and sigmoid. Available methods are scale or auto, or positive float.

coef0float, default=0.0

What offset to use for poly and sigmoid.

degreefloat, default=3.0

What degree polynomial to use (if any).

tolfloat, default=1e-3

Tolerance over maximum update step (i.e., when maximal gradient < tol, early stopping is triggered).

lrfloat, default=1e-3

The learning rate.

max_iterint, default=1000

The maximum number of iterations to perform while fitting, or -1 to disable.

X_train_np.ndarray | torch.Tensor

A clone of the training data used internally for kernel estimation.

A_np.ndarray | torch.Tensor

A clone of the alpha data used internally for kernel estimation.

gamma_float

Estimated gamma parameter.

eps_float, default=1e-12

Error margin for support vectors used internally.

w_np.ndarray | torch.Tensor

If linear kernel, estimated weights.

p_np.ndarray | torch.Tensor

If linear kernel, estimated patterns.

intercept_np.ndarray | torch.Tensor

The intercept vector.

coef_np.ndarray | torch.Tensor

If kernel is linear, the coefficients of the model.

pattern_np.ndarray | torch.Tensor

If kernel is linear, the patterns used by the model.

binariser_mvpy.preprocessing.LabelBinariser

The binariser used internally.

scaler_mvpy.preprocessing.Scaler

The scaler used internally.

metric_mvpy.metrics.accuracy

The default metric to use.

Notes

Coefficients are interpretable only when the kernel is linear. In this case, patterns are computed as per [2].
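For instance, a minimal sketch of accessing weights and patterns with a linear kernel:

>>> import torch
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import make_classification
>>> X, y = make_classification()
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> clf = SVC(kernel = 'linear').fit(X, y)
>>> coef, pattern = clf.coef_, clf.pattern_   # interpretable only for the linear kernel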

References

[1]

Awad, M., & Khanna, R. (2015). Support vector machines for classification. Efficient Learning Machines, 39-66. 10.1007/978-1-4302-5990-9_3

[2]

Haufe, S., Meinecke, F., Görgen, K., Dähne, S., Haynes, J.D., Blankertz, B., & Bießmann, F. (2014). On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage, 87, 96-110. 10.1016/j.neuroimage.2013.10.067

Examples

First, let’s look at a case where we have one feature that has two classes.

>>> import torch
>>> import mvpy as mv
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import make_circles
>>> X, y = make_circles(noise = 0.3)
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> clf = SVC(kernel = 'rbf').fit(X, y)
>>> y_h = clf.predict(X)
>>> mv.math.accuracy(y_h.squeeze(), y)
tensor(0.6700)

Second, let’s look at a case where we have one feature that has three classes.

>>> import torch
>>> import mvpy as mv
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import load_iris
>>> X, y = load_iris(return_X_y = True)
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> clf = SVC(kernel = 'rbf').fit(X, y)
>>> y_h = clf.predict(X)
>>> mv.math.accuracy(y_h.squeeze(), y)
tensor(0.9733)

Third, let’s look at a case where we have two features with a variable number of classes.

>>> import numpy as np
>>> import torch
>>> import mvpy as mv
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import make_classification
>>> X0, y0 = make_classification(n_classes = 3, n_informative = 6)
>>> X1, y1 = make_classification(n_classes = 4, n_informative = 8)
>>> X = torch.from_numpy(np.concatenate((X0, X1), axis = -1)).float()
>>> y = torch.from_numpy(np.stack((y0, y1), axis = -1)).float()
>>> clf = SVC(kernel = 'rbf').fit(X, y)
>>> y_h = clf.predict(X)
>>> mv.math.accuracy(y_h.T, y.T)
tensor([1.0000, 0.9800])
clone() SVC[source]#

Clone this class.

Returns:
svcmvpy.estimators.SVC

The cloned object.

copy() SVC[source]#

Clone this class.

Returns:
svcmvpy.estimators.SVC

The cloned object.

decision_function(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features of shape (n_samples, n_channels).

Returns:
dfnp.ndarray | torch.Tensor

The predictions of shape (n_samples, n_classes).

fit(X: ndarray | Tensor, y: ndarray | Tensor) BaseEstimator[source]#

Fit the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features of shape (n_samples, n_channels).

ynp.ndarray | torch.Tensor

The targets of shape (n_samples[, n_features]).

Returns:
clfmvpy.estimators.SVC

The classifier.

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features of shape (n_samples, n_channels).

Returns:
y_hnp.ndarray | torch.Tensor

The predictions of shape (n_samples, n_features).

predict_proba(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

The features (n_samples, n_channels).

Returns:
pnp.ndarray | torch.Tensor

The predictions of shape (n_samples, n_classes).

Warning

Probabilities are computed from expit() over outputs of decision_function(). Consequently, probability estimates returned by this class are not calibrated. See Classifier for more information.
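A minimal sketch of obtaining such (uncalibrated) probabilities:

>>> import torch
>>> from mvpy.estimators import SVC
>>> from sklearn.datasets import make_circles
>>> X, y = make_circles(noise = 0.3)
>>> X, y = torch.from_numpy(X).float(), torch.from_numpy(y).float()
>>> p = SVC(kernel = 'rbf').fit(X, y).predict_proba(X)   # expit over decision values; not calibrated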

score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#

Make predictions from \(X\) and score against \(y\).

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels).

ynp.ndarray | torch.Tensor

Output data of shape (n_samples, n_features).

metricOptional[Metric | Tuple[Metric]], default=None

Metric or tuple of metrics to compute. If None, defaults to metric_.

Returns:
scorenp.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]

Scores of shape (n_features,) or, for multiple metrics, a dictionary of metric names and scores of shape (n_features,).

Warning

If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.

to_numpy() BaseEstimator[source]#

Obtain the estimator with numpy as backend.

Returns:
svcmvpy.estimators._SVC_numpy

The estimator.

to_torch() BaseEstimator[source]#

Obtain the estimator with torch as backend.

Returns:
svcmvpy.estimators._SVC_torch

The estimator.

mvpy.estimators.timedelayed module#

A collection of estimators for TimeDelayed modeling (mTRF + SR).

class mvpy.estimators.timedelayed.TimeDelayed(t_min: float, t_max: float, fs: int, alphas: ndarray | Tensor = tensor([1]), patterns: bool = False, **kwargs)[source]#

Bases: BaseEstimator

Implements time delayed ridge regression (for multivariate temporal response functions or stimulus reconstruction).

Generally, mTRF models are described by:

\[r(t,n) = \sum_{\tau} w(\tau, n) s(t - \tau) + \varepsilon\]

where \(r(t,n)\) is the reconstructed signal at timepoint \(t\) for channel \(n\), \(s(t)\) is the stimulus at time \(t\), \(w(\tau, n)\) is the weight at time delay \(\tau\) for channel \(n\), and \(\varepsilon\) is the error.

SR models are estimated as:

\[s(t) = \sum_{n}\sum_{\tau} r(t + \tau, n) g(\tau, n)\]

where \(s(t)\) is the reconstructed stimulus at time \(t\), \(r(t,n)\) is the neural response at \(t\) and lagged by \(\tau\) for channel \(n\), \(g(\tau, n)\) is the weight at time delay \(\tau\) for channel \(n\).

For more information on mTRF or SR models, see [1].

In both cases, models are constructed by temporally expanding the design matrix and outcome matrix and then solving for the regression problem:

\[y = \beta X + \varepsilon\]

Consequently, we solve for coefficients through:

\[\arg\min_{\beta} \sum_{i} (y_i - \beta^T X_i)^2 + \alpha_\beta \lvert\lvert\beta\rvert\rvert^2\]

where \(\alpha_\beta\) are the penalties to test in LOO-CV. Therefore, this class is functionally equivalent to ReceptiveField, but solves the problem through ridge regression rather than auto- and cross-correlations in the Fourier domain. For more information on this, see ReceptiveField.
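As a sketch of the window bookkeeping (the lag count below is an assumption, consistent with the five-tap example further down, not a statement about the internal implementation):

>>> t_min, t_max, fs = -2, 2, 1
>>> n_delays = int(round((t_max - t_min) * fs)) + 1   # assumed number of delays
>>> n_delays
5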

Parameters:
t_minfloat

The minimum time delay. Note that positive values indicate X is delayed relative to y. This is unlike MNE’s behaviour.

t_maxfloat

The maximum time delay. Note that positive values indicate X is delayed relative to y. This is unlike MNE’s behaviour.

fsint

The sampling frequency.

alphasnp.ndarray | torch.Tensor, default=torch.tensor([1])

The penalties to use for estimation.

patternsbool, default=False

Should patterns be estimated?

kwargsAny

Additional arguments for the estimator.

Attributes:
alphasnp.ndarray | torch.Tensor

The penalties to use for estimation.

kwargsAny

Additional arguments.

patternsbool

Should patterns be estimated?

t_minfloat

The minimum time delay. Note that positive values indicate X is delayed relative to y. This is unlike MNE’s behaviour.

t_maxfloat

The maximum time delay. Note that positive values indicate X is delayed relative to y. This is unlike MNE’s behaviour.

fsint

The sampling frequency.

windownp.ndarray | torch.Tensor

The window to use for estimation.

estimatormvpy.estimators.RidgeCV

The estimator to use.

f_int

The number of output features.

c_int

The number of input features.

w_int

The number of time delays.

intercept_np.ndarray | torch.Tensor

The intercepts of the estimator.

coef_np.ndarray | torch.Tensor

The coefficients of the estimator.

pattern_np.ndarray | torch.Tensor

The patterns of the estimator.

metric_mvpy.metrics.r2

The default metric to use.

See also

mvpy.estimators.ReceptiveField

An alternative mTRF/SR estimator that solves through auto- and cross-correlations in the Fourier domain.

Notes

For SR models, it is recommended to also pass patterns=True to estimate not only the coefficients but also the patterns that were actually used for reconstructing stimuli. For more information, see [2].

References

[1]

Crosse, M.J., Di Liberto, G.M., Bednar, A., & Lalor, E.C. (2016). The multivariate temporal response function (mTRF) toolbox: A MATLAB toolbox for relating neural signals to continuous stimuli. Frontiers in Human Neuroscience, 10, 604. 10.3389/fnhum.2016.00604

[2]

Haufe, S., Meinecke, F., Görgen, K., Dähne, S., Haynes, J.D., Blankertz, B., & Bießmann, F. (2014). On the interpretation of weight vectors of linear models in multivariate neuroimaging. NeuroImage, 87, 96-110. 10.1016/j.neuroimage.2013.10.067

Examples

For mTRF estimation, we can do:

>>> import torch
>>> from mvpy.estimators import TimeDelayed
>>> ß = torch.tensor([1., 2., 3., 2., 1.])
>>> X = torch.normal(0, 1, (100, 1, 50))
>>> y = torch.nn.functional.conv1d(X, ß[None,None,:], padding = 'same')
>>> y = y + torch.normal(0, 1, y.shape)
>>> trf = TimeDelayed(-2, 2, 1, alphas = 1e-5)
>>> trf.fit(X, y).coef_
tensor([[[0.9290, 1.9101, 2.8802, 1.9790, 0.9453]]])

For stimulus reconstruction, we can do:

>>> import torch
>>> from mvpy.estimators import TimeDelayed
>>> ß = torch.tensor([1., 2., 3., 2., 1.])
>>> X = torch.arange(50)[None,None,:] * torch.ones((100, 1, 50))
>>> y = torch.nn.functional.conv1d(X, ß[None,None,:], padding = 'same')
>>> y = y + torch.normal(0, 1, y.shape)
>>> X, y = y, X
>>> sr = TimeDelayed(-2, 2, 1, alphas = 1e-3, patterns = True).fit(X, y)
>>> sr.predict(X).mean(0)[0,:]
tensor([ 1.3591,  1.2549,  1.5662,  2.3544,  3.3440,  4.3683,  5.4097,  6.4418, 
         7.4454,  8.4978,  9.5206, 10.5374, 11.5841, 12.6102, 13.6254, 14.6939, 
         15.6932, 16.7168, 17.7619, 18.8130, 19.8182, 20.8687, 21.8854, 22.9310, 
         23.9270, 24.9808, 26.0085, 27.0347, 28.0728, 29.0828, 30.1400, 31.1452, 
         32.1793, 33.2047, 34.2332, 35.2717, 36.2945, 37.3491, 38.3800, 39.3817, 
         40.3962, 41.4489, 42.4854, 43.4965, 44.5346, 45.5716, 46.7301, 47.2251, 
         48.4449, 48.8793])
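Because we passed patterns=True, the patterns actually used for reconstruction are then available on the fitted estimator:

>>> sr.pattern_   # reconstruction patterns, as per [2]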
clone() TimeDelayed[source]#

Clone this class.

Returns:
tdTimeDelayed

The cloned object.

fit(X: ndarray | Tensor, y: ndarray | Tensor)[source]#

Fit the estimator.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_features, n_timepoints).

ynp.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels, n_timepoints).

Returns:
tdmvpy.estimators._TimeDelayed_numpy | mvpy.estimators._TimeDelayed_torch

The fitted TimeDelayed estimator.

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Make predictions from model.

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_features, n_timepoints).

Returns:
y_hnp.ndarray | torch.Tensor

Predicted responses of shape (n_samples, n_channels, n_timepoints).

score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#

Make predictions from \(X\) and score against \(y\).

Parameters:
Xnp.ndarray | torch.Tensor

Input data of shape (n_samples, n_features, n_timepoints).

ynp.ndarray | torch.Tensor

Output data of shape (n_samples, n_channels, n_timepoints).

metricOptional[Metric | Tuple[Metric]], default=None

Metric or tuple of metrics to compute. If None, defaults to metric_.

Returns:
scorenp.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]

Scores of shape (n_channels, n_timepoints) or, for multiple metrics, a dictionary of metric names and scores of shape (n_channels, n_timepoints).

Warning

If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.

Module contents#

A collection of estimators for decoding and disentangling features using back2back regression.

A collection of estimators for decoding features using ridge classifiers.

A collection of estimators for covariance estimation.

A collection of estimators for common spatial patterns.

A collection of estimators for fitting cross-validated ridge regressions.

A collection of estimators for ReceptiveField modeling (mTRF + SR).

A collection of estimators for ridge classification.

A collection of estimators for fitting cross-validated ridge regressions.

A collection of estimators for decoding features using ridge decoders.

A collection of estimators for encoding features using ridge regressions.

A collection of estimators for computing representational similarities.

A collection of estimators that allow for sliding other estimators over a dimension of the data.

A collection of estimators for support vector classification.

A collection of estimators for TimeDelayed modeling (mTRF + SR).