RidgeCV#

class mvpy.estimators.RidgeCV(alphas: ndarray | Tensor | list | float | int = 1, fit_intercept: bool = True, normalise: bool = True, alpha_per_target: bool = False)[source]#

Implements ridge regression with cross-validation.

Ridge regression maps input data \(X\) to output data \(y\) through coefficients \(\beta\):

\[y = \beta X + \varepsilon\]

and solves for the model \(\beta\) through:

\[\arg\min_\beta \sum_i (y_i - \beta^T X_i)^2 + \alpha_\beta\lvert\lvert\beta\rvert\rvert^2\]

where \(\alpha_\beta\) denotes the candidate penalties to test. These are evaluated by leave-one-out cross-validation (LOO-CV), which here has a convenient closed-form solution:

\[\arg\min_{\alpha_\beta} \frac{1}{N}\sum_{i = 1}^{N} \left(\frac{y_i - \beta_\alpha^T X_i}{1 - H_{\alpha,ii}}\right)^2\qquad \textrm{where}\qquad H_{\alpha,ii} = \textrm{diag}\left(X(X^T X + \alpha I)^{-1}X^T\right)_i\]
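For reference, the inner ridge problem itself has the familiar closed-form solution \(\hat\beta_\alpha = (X^T X + \alpha I)^{-1} X^T y\), which is the coefficient estimate entering the LOO formula above.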

As such, this estimator automatically evaluates the LOO-CV error for all values of alphas and chooses the penalty that minimises the mean-squared loss. This is convenient because it is much faster than tuning penalties through an inner cross-validation loop.
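A minimal sketch of this closed-form selection, written in plain torch for illustration (this is not the library's internal implementation; the data and penalty grid are arbitrary):

>>> import torch
>>> alphas = torch.logspace(-3, 3, 7)
>>> X = torch.normal(0, 1, size = (240, 5))
>>> y = X @ torch.normal(0, 1, size = (5,)) + torch.normal(0, 0.5, size = (240,))
>>> errors = []
>>> for alpha in alphas:
...     G = torch.linalg.inv(X.T @ X + alpha * torch.eye(X.shape[1]))  # (X^T X + alpha I)^(-1)
...     beta = G @ X.T @ y                                             # ridge solution for this penalty
...     h = torch.diagonal(X @ G @ X.T)                                # H_{alpha,ii}
...     errors.append((((y - X @ beta) / (1 - h)) ** 2).mean())        # mean squared LOO error
>>> best_alpha = alphas[torch.stack(errors).argmin()]                  # penalty minimising the LOO MSE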

For more information on ridge regression, see [1]. This implementation follows [2].

Parameters:
alphas : np.ndarray | torch.Tensor | List | float | int, default=1

Penalties to use for estimation.

fit_intercept : bool, default=True

Whether to fit an intercept.

normalise : bool, default=True

Whether to normalise the data.

alpha_per_target : bool, default=False

Whether to use a different penalty for each target.
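For example, a grid of candidate penalties can be supplied, optionally selecting one penalty per target (the grid below is an illustrative choice):

>>> import torch
>>> from mvpy.estimators import RidgeCV
>>> alphas = torch.logspace(-3, 3, 7)
>>> model = RidgeCV(alphas = alphas, fit_intercept = True, normalise = True, alpha_per_target = True)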

Attributes:
alpha_ : np.ndarray | torch.Tensor

The penalties used for estimation.

intercept_ : np.ndarray | torch.Tensor

The intercepts of shape (n_features,).

coef_ : np.ndarray | torch.Tensor

The coefficients of shape (n_channels, n_features).

metric_ : mvpy.metrics.r2

The default metric to use.

Notes

If data are supplied as numpy, this class will fall back to sklearn.linear_model.RidgeCV. See [3].
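As a minimal sketch of this fallback (the data are arbitrary placeholders), NumPy inputs are handled by the scikit-learn backend, so the fitted attributes are expected to be returned as NumPy arrays:

>>> import numpy as np
>>> from mvpy.estimators import RidgeCV
>>> X = np.random.normal(0, 1, size = (240, 5))
>>> y = X @ np.random.normal(0, 1, size = (5,)) + np.random.normal(0, 0.5, size = (240,))
>>> model = RidgeCV().fit(X, y)
>>> type(model.coef_)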

References

[1]

McDonald, G.C. (2009). Ridge regression. Wiley Interdisciplinary Reviews: Computational Statistics, 1, 93-100. doi.org/10.1002/wics.14

[2]

King, J.R. (2020). torch_ridge. kingjr/torch_ridge

[3]

Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., … & Vanderplas, J. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825-2830.

Examples

>>> import torch
>>> from mvpy.estimators import RidgeCV
>>> ß = torch.normal(0, 1, size = (5,))
>>> X = torch.normal(0, 1, size = (240, 5))
>>> y = ß @ X.T + torch.normal(0, 0.5, size = (X.shape[0],))
>>> model = RidgeCV().fit(X, y)
>>> model.coef_
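Continuing this example, the other fitted attributes documented above can be inspected in the same way:

>>> model.alpha_
>>> model.intercept_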
clone() RidgeCV[source]#

Make a clone of this class.

Returns:
ridge : RidgeCV

A clone of this class.
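As a brief usage sketch (assuming, in line with scikit-learn conventions, that the clone shares the original estimator's settings):

>>> copy = model.clone()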

fit(X: ndarray | Tensor, y: ndarray | Tensor) RidgeCV[source]#

Fit the estimator.

Parameters:
X : np.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels).

y : np.ndarray | torch.Tensor

Output data of shape (n_samples, n_features).

Returns:
ridge : RidgeCV

The fitted ridge estimator.
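Since fit returns the fitted estimator, construction and fitting can be chained; the shapes below follow the parameter descriptions above and the data are random placeholders:

>>> X = torch.normal(0, 1, size = (240, 5))
>>> y = torch.normal(0, 1, size = (240, 3))
>>> model = RidgeCV(alphas = torch.logspace(-3, 3, 7)).fit(X, y)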

predict(X: ndarray | Tensor) ndarray | Tensor[source]#

Predict from the estimator.

Parameters:
X : np.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels).

Returns:
y_h : np.ndarray | torch.Tensor

Predicted data of shape (n_samples, n_features).
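For example, continuing from the fit above, predictions on held-out data keep the feature dimension of the training targets:

>>> X_test = torch.normal(0, 1, size = (60, 5))
>>> y_hat = model.predict(X_test)
>>> y_hat.shape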

score(X: ndarray | Tensor, y: ndarray | Tensor, metric: Metric | Tuple[Metric] | None = None) ndarray | Tensor | Dict[str, ndarray] | Dict[str, Tensor][source]#

Make predictions from \(X\) and score against \(y\).

Parameters:
X : np.ndarray | torch.Tensor

Input data of shape (n_samples, n_channels).

y : np.ndarray | torch.Tensor

Output data of shape (n_samples, n_features).

metric : Metric | Tuple[Metric] | None, default=None

Metric or tuple of metrics to compute. If None, defaults to metric_.

Returns:
score : np.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]

Scores of shape (n_features,) or, for multiple metrics, a dictionary of metric names and scores of shape (n_features,).

Warning

If multiple metrics are supplied, this function returns a dictionary of {Metric.name: score, ...} rather than a stacked array. This keeps the output consistent across cases where metrics may differ in their output shapes.
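As a usage sketch, scoring with metric=None falls back to the default metric_ (documented above as mvpy.metrics.r2) and returns one score per feature; supplying a tuple of metrics instead yields the dictionary described in the warning. The test targets here are random placeholders:

>>> y_test = torch.normal(0, 1, size = (60, 3))
>>> model.score(X_test, y_test).shape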