Covariance#

class mvpy.estimators.Covariance(method: str = 'ledoitwolf', s_min: float | None = None, s_max: float | None = None)[source]#

Implements covariance and precision estimation as well as whitening of data.

For covariance estimation, three methods are currently available through method:

  1. empirical

    This computes the empirical (sample) covariance matrix:

    \[\Sigma = \mathbb{E}\left[(X - \mathbb{E}[X])(X - \mathbb{E}[X])^T\right]\]

    This is computationally efficient, but the resulting estimates of \(\Sigma\) can be unfavourable: given small datasets or noisy measurements, \(\Sigma\) may be ill-conditioned and not positive-definite, with eigenvalues systematically pushed towards the tails. In practice, this makes inversion challenging and hurts out-of-sample generalisation.

  2. ledoitwolf

    This computes the Ledoit-Wolf shrinkage estimator:

    \[\hat\Sigma = (1 - \hat{\delta})\Sigma + \hat\delta T\]

    where \(\hat{\delta}\in[0, 1]\) is a data-driven shrinkage intensity chosen to minimise the expected Frobenius-norm risk:

    \[\hat\delta = \min\left\{1, \max\left\{0, \frac{\hat\pi}{\hat\rho}\right\}\right\},\qquad \hat\rho = \lVert\Sigma - T\rVert_F^2,\qquad \hat\pi = \frac{1}{n^2}\sum_{k=1}^{n}\lVert x_k x_k^T - \Sigma\rVert_F^2\]

    and where:

    \[T = \mu I_p,\qquad \mu = \frac{1}{p}\textrm{tr}(\Sigma)\]

    This produces estimates that are well-conditioned and positive-definite. For more information on this procedure, please see [1].

  3. oas

    This computes the oracle approximating shrinkage (OAS) estimator:

    \[\hat\Sigma = (1 - \hat{\delta})\Sigma + \hat\delta T\]

    where \(\hat{\delta}\in[0, 1]\) is the data-driven shrinkage intensity:

    \[\hat\delta = \frac{(1 - \frac{2}{p}) \textrm{tr}(\Sigma^2) + \textrm{tr}(\Sigma)^2}{(n + 1 - \frac{2}{p})\left(\textrm{tr}(\Sigma^2) - \frac{\textrm{tr}(\Sigma)^2}{p}\right)},\qquad T = \mu I_p,\qquad \mu = \frac{1}{p}\textrm{tr}(\Sigma)\]

    Like ledoitwolf, this procedure produces estimates that are well-conditioned and positive-definite, but its shrinkage tends to be more aggressive. For more information, please see [2]. A didactic sketch of all three estimators is given below.
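For reference, the three estimators can be transcribed directly from the formulas above. The following is a didactic NumPy sketch for 2-D data of shape (n, p); the helper names are hypothetical and this is not mvpy's implementation:

    import numpy as np

    def empirical_cov(X):
        # Sigma = E[(X - E[X])(X - E[X])^T] for X of shape (n, p)
        Xc = X - X.mean(axis=0, keepdims=True)
        return (Xc.T @ Xc) / X.shape[0]

    def shrunk_cov(X, method="ledoitwolf"):
        # Shrinkage estimate (1 - delta) * Sigma + delta * T with target T = mu * I_p
        n, p = X.shape
        S = empirical_cov(X)
        mu = np.trace(S) / p
        T = mu * np.eye(p)
        if method == "ledoitwolf":
            # pi and rho as defined above, computed from centred samples
            Xc = X - X.mean(axis=0, keepdims=True)
            rho = np.linalg.norm(S - T, "fro") ** 2
            pi = sum(np.linalg.norm(np.outer(x, x) - S, "fro") ** 2 for x in Xc) / n ** 2
            delta = min(1.0, max(0.0, pi / rho))
        elif method == "oas":
            tr_S, tr_S2 = np.trace(S), np.trace(S @ S)
            num = (1 - 2 / p) * tr_S2 + tr_S ** 2
            den = (n + 1 - 2 / p) * (tr_S2 - tr_S ** 2 / p)
            delta = min(1.0, max(0.0, num / den))
        else:
            delta = 0.0  # 'empirical': no shrinkage
        return (1 - delta) * S + delta * T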

When calling transform on this class, data will automatically be whitened based on the estimated covariance matrix. The whitening matrix is computed from the eigendecomposition as follows:

\[\Sigma = Q\Lambda Q^T,\qquad \Lambda = \textrm{diag}(\lambda_1, ..., \lambda_p) \geq 0,\qquad W = Q\Lambda^{-\frac{1}{2}}Q^T\]

For more information on whitening, refer to [3].
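As a sketch of this computation (assuming a symmetric, positive semi-definite covariance matrix in NumPy; the eps guard against zero eigenvalues is an illustrative safeguard, not necessarily part of mvpy's internals):

    import numpy as np

    def zca_whitener(Sigma, eps=1e-12):
        # Eigendecomposition Sigma = Q Lambda Q^T (Sigma symmetric PSD)
        lam, Q = np.linalg.eigh(Sigma)
        # W = Q Lambda^(-1/2) Q^T; clip eigenvalues away from zero before inverting
        return Q @ np.diag(1.0 / np.sqrt(np.maximum(lam, eps))) @ Q.T

For a well-conditioned \(\Sigma\), whitened data \(Wx\) have identity covariance, since \(W\Sigma W^T = I_p\).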

Parameters:
method : {‘empirical’, ‘ledoitwolf’, ‘oas’}, default = ‘ledoitwolf’

Which method to use for covariance estimation.

s_min : float, default = None

Minimum sample in the time dimension to consider when estimating the covariance.

s_max : float, default = None

Maximum sample in the time dimension to consider when estimating the covariance.

Attributes:
covariance_ : np.ndarray | torch.Tensor

Covariance matrix.

precision_ : np.ndarray | torch.Tensor

Precision matrix (inverse of the covariance matrix).

whitener_ : np.ndarray | torch.Tensor

Whitening matrix.

shrinkage_ : float, default = None

Shrinkage intensity \(\hat\delta\), if used by method.

Notes

This class assumes features to be the second-to-last dimension of the data, unless there are only two dimensions (in which case features are assumed to be the last dimension).
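For instance, the following illustrative doctest (mirroring the example further below) assumes a 2-D input of shape (n_trials, n_features):

>>> import torch
>>> from mvpy.estimators import Covariance
>>> X = torch.normal(0, 1, (100, 10))
>>> Covariance().fit(X).covariance_.shape
torch.Size([10, 10])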

References

[1]

Ledoit, O., & Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88, 365-411. 10.1016/S0047-259X(03)00096-4

[2]

Chen, Y., Wiesel, A., Eldar, Y.C., & Hero, A.O. (2009). Shrinkage algorithms for MMSE covariance estimation. arXiv. 10.48550/arXiv.0907.4698

[3]

Kessy, A., Lewin, A., & Strimmer, K. (2016). Optimal whitening and decorrelation. arXiv. 10.48550/arXiv.1512.00809

Examples

>>> import torch
>>> from mvpy.estimators import Covariance
>>> X = torch.normal(0, 1, (100, 10, 100))
>>> cov = Covariance(s_max = 20).fit(X)
>>> cov.covariance_.shape
torch.Size([10, 10])
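
To whiten the data with the fitted estimator, call transform; per the shapes documented below, the output matches the input shape (continuing the example above):

>>> W = cov.transform(X)
>>> W.shape
torch.Size([100, 10, 100])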
clone() → Covariance[source]#

Obtain a clone of this estimator.

Returns:
cov : mvpy.estimators.Covariance

The cloned object.

fit(X: ndarray | Tensor, *args: Any) → Covariance[source]#

Fit the covariance estimator.

Parameters:
X : np.ndarray | torch.Tensor

Data to fit the estimator on, of shape (n_trials, n_features[, n_timepoints]).

*args : Any

Additional arguments to pass to the estimator.

Returns:
self : Covariance

Fitted covariance estimator.

fit_transform(X: ndarray | Tensor, *args: Any) → ndarray | Tensor[source]#

Fit the covariance estimator and whiten the data.

Parameters:
X : np.ndarray | torch.Tensor

Data to fit the estimator on and transform, of shape (n_trials, n_features[, n_timepoints]).

*args : Any

Additional arguments to pass to the estimator.

Returns:
W : np.ndarray | torch.Tensor

Whitened data of shape (n_trials, n_features[, n_timepoints]).

to_numpy() → BaseEstimator[source]#

Create the numpy estimator. Note that this function cannot be used for conversion.

Returns:
cov : mvpy.estimators.Covariance

The numpy estimator.

to_torch() → BaseEstimator[source]#

Create the torch estimator. Note that this function cannot be used for conversion.

Returns:
cov : mvpy.estimators.Covariance

The torch estimator.

transform(X: ndarray | Tensor, *args: Any) → ndarray | Tensor[source]#

Whiten data using the fitted covariance estimator.

Parameters:
X : np.ndarray | torch.Tensor

Data to transform of shape (n_trials, n_features[, n_timepoints]).

*args : Any

Additional arguments to pass to the estimator.

Returns:
W : np.ndarray | torch.Tensor

Whitened data of shape (n_trials, n_features[, n_timepoints]).