Covariance
- class mvpy.estimators.Covariance(method: str = 'ledoitwolf', s_min: float | None = None, s_max: float | None = None)
Implements covariance and precision estimation as well as whitening of data.
For covariance estimation, three methods are currently available through method:
empirical
This computes the empirical (sample) covariance matrix:
\[\Sigma = \mathbb{E}\left[(X - \mathbb{E}[X])(X^T - \mathbb{E}[X^T])\right]\]
This is computationally efficient, but often produces unfavourable estimates of the covariance \(\Sigma\): given small datasets or noisy measurements, \(\Sigma\) may be ill-conditioned and not positive-definite, with eigenvalues that tend to be systematically pushed towards the tails. In practice, this can make inversion challenging and hurt out-of-sample generalisation.
ledoitwolf
This computes the Ledoit-Wolf shrinkage estimator:
\[\hat\Sigma = (1 - \hat{\delta})\Sigma + \hat\delta T\]
where \(\hat{\delta}\in[0, 1]\) is the data-driven shrinkage intensity that minimises the expected Frobenius-norm risk:
\[\hat\delta = \min\left\{1, \max\left\{0, \frac{\hat\pi}{\hat\rho}\right\}\right\},\qquad \hat\rho = \lvert\lvert\Sigma - T\rvert\rvert_F^2,\qquad \hat\pi = \frac{1}{n}\sum_{k=1}^{n}\lvert\lvert x_k x_k^T - \Sigma\rvert\rvert_F^2\]
and where:
\[T = \mu I_p,\qquad \mu = \frac{1}{p}\textrm{tr}(\Sigma)\]
This produces estimates that are well-conditioned and positive-definite. For more information on this procedure, please see [1].
oas
This computes the oracle approximating shrinkage estimator:
\[\hat\Sigma = (1 - \hat{\delta})\Sigma + \hat\delta T\]
where \(\hat{\delta}\in[0, 1]\) is the data-driven shrinkage intensity:
\[\hat\delta = \frac{(1 - \frac{2}{p}) \textrm{tr}(\Sigma^2) + \textrm{tr}(\Sigma)^2}{(n + 1 - \frac{2}{p})\left(\textrm{tr}(\Sigma^2) - \frac{\textrm{tr}(\Sigma)^2}{p}\right)},\qquad T = \mu I_p,\qquad \mu = \frac{1}{p}\textrm{tr}(\Sigma)\]
Like ledoitwolf, this procedure produces estimates that are well-conditioned and positive-definite. Unlike ledoitwolf, it tends to shrink more aggressively. For more information, please see [2].
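The shrinkage formulas above can be sketched in plain NumPy. This is a minimal illustration under stated assumptions, not the code used by mvpy: the helper names (`empirical_covariance`, `shrink_to_identity`, `oas_delta`) are hypothetical, the sample covariance is normalised by \(n\) rather than \(n - 1\), and the OAS intensity is clipped to 1 so that it stays in \([0, 1]\).

```python
import numpy as np

def empirical_covariance(X):
    """Sample covariance of X with shape (n_samples, n_features).

    Assumption: normalise by n (not n - 1).
    """
    Xc = X - X.mean(axis=0, keepdims=True)
    return Xc.T @ Xc / X.shape[0]

def shrink_to_identity(S, delta):
    """Return (1 - delta) * S + delta * T with T = mu * I and mu = tr(S) / p."""
    p = S.shape[0]
    mu = np.trace(S) / p
    return (1.0 - delta) * S + delta * mu * np.eye(p)

def oas_delta(S, n):
    """OAS shrinkage intensity as written in the formula above.

    Assumption: clip the result to 1 so that delta stays in [0, 1].
    """
    p = S.shape[0]
    tr_s2 = np.trace(S @ S)        # tr(Sigma^2)
    tr_s_sq = np.trace(S) ** 2     # tr(Sigma)^2
    num = (1.0 - 2.0 / p) * tr_s2 + tr_s_sq
    den = (n + 1.0 - 2.0 / p) * (tr_s2 - tr_s_sq / p)
    if den == 0:                   # S is already proportional to the identity
        return 1.0
    return min(1.0, num / den)

# Fewer samples than features: the sample covariance is rank-deficient.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 10))
S = empirical_covariance(X)
delta = oas_delta(S, n=X.shape[0])
S_hat = shrink_to_identity(S, delta)
print("shrinkage intensity:", delta)
print("smallest eigenvalue before/after:",
      np.linalg.eigvalsh(S).min(), np.linalg.eigvalsh(S_hat).min())
```

Because the target \(T\) has the same trace as \(\Sigma\), shrinkage preserves the total variance while pulling every eigenvalue towards \(\mu\), which is what makes the shrunk estimate invertible even when the sample covariance is singular.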
When calling transform on this class, data will automatically be whitened based on the estimated covariance matrix. The whitening matrix is computed from the eigendecomposition as follows:
\[\Sigma = Q\Lambda Q^T,\qquad \Lambda = \textrm{diag}(\lambda_1, \ldots, \lambda_p) \geq 0,\qquad W = Q\Lambda^{-\frac{1}{2}}Q^T\]
For more information on whitening, refer to [3].
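As a rough illustration of the whitening matrix defined above, the following NumPy sketch builds \(W = Q\Lambda^{-\frac{1}{2}}Q^T\) from an eigendecomposition. The function name `whitening_matrix` and the eigenvalue floor `eps` are assumptions made for this example; how mvpy handles near-zero eigenvalues is not stated here.

```python
import numpy as np

def whitening_matrix(S, eps=1e-12):
    """Return W = Q diag(lambda^-1/2) Q^T for a symmetric matrix S.

    Assumption: eigenvalues are floored at eps to avoid dividing by zero.
    """
    lam, Q = np.linalg.eigh(S)           # S = Q diag(lam) Q^T
    lam = np.clip(lam, eps, None)
    return (Q * lam ** -0.5) @ Q.T       # scales the columns of Q, then rotates back

# A small positive-definite covariance matrix.
S = np.array([[2.0, 0.5, 0.0, 0.0],
              [0.5, 1.0, 0.2, 0.0],
              [0.0, 0.2, 1.5, 0.1],
              [0.0, 0.0, 0.1, 0.8]])
W = whitening_matrix(S)

# Whitening maps the covariance to (approximately) the identity.
print(np.round(W @ S @ W.T, 6))
```

Since \(W\) is symmetric, centred data of shape (n_samples, n_features) can be whitened as `X @ W`, after which its covariance is approximately the identity.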
- Parameters:
- method : {'empirical', 'ledoitwolf', 'oas'}, default = 'ledoitwolf'
Which method should be applied for estimation of covariance?
- s_min : float, default = None
What’s the minimum sample we should consider in the time dimension?
- s_max : float, default = None
What’s the maximum sample we should consider in the time dimension?
- Attributes:
- covariance_ : np.ndarray | torch.Tensor
Covariance matrix.
- precision_ : np.ndarray | torch.Tensor
Precision matrix (inverse of the covariance matrix).
- whitener_ : np.ndarray | torch.Tensor
Whitening matrix.
- shrinkage_ : float, default = None
Shrinkage parameter, if used by method.
Notes
This class assumes features to be the second to last dimension of the data, unless there are only two dimensions (in which case it is assumed to be the last dimension).
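The axis convention in this note can be illustrated with a small NumPy helper. This is only a sketch: `as_samples_by_features` is a hypothetical name, and pooling trials and time points into one sample axis is an assumption made for illustration, not a description of how mvpy combines them internally.

```python
import numpy as np

def as_samples_by_features(X):
    """Rearrange X so that features are on the last axis.

    2D input (n_trials, n_features) is returned unchanged. For higher-dimensional
    input, features are taken from the second to last axis and all remaining
    axes are pooled into a single sample axis (an assumption of this sketch).
    """
    X = np.asarray(X)
    if X.ndim == 2:
        return X
    n_features = X.shape[-2]
    return np.moveaxis(X, -2, -1).reshape(-1, n_features)

X2 = np.zeros((6, 3))       # (n_trials, n_features)
X3 = np.zeros((6, 3, 5))    # (n_trials, n_features, n_timepoints)
print(as_samples_by_features(X2).shape, as_samples_by_features(X3).shape)
```

Either way, the feature axis determines the size of the resulting covariance matrix, which is (n_features, n_features).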
References
[1] Ledoit, O., & Wolf, M. (2004). A well-conditioned estimator for large-dimensional covariance matrices. Journal of Multivariate Analysis, 88, 365-411. 10.1016/S0047-259X(03)00096-4
[2] Chen, Y., Wiesel, A., Eldar, Y.C., & Hero, A.O. (2009). Shrinkage algorithms for MMSE covariance estimation. arXiv. 10.48550/arXiv.0907.4698
[3] Kessy, A., Lewin, A., & Strimmer, K. (2016). Optimal whitening and decorrelation. arXiv. 10.48550/arXiv.1512.00809
Examples
>>> import torch
>>> from mvpy.estimators import Covariance
>>> X = torch.normal(0, 1, (100, 10, 100))
>>> cov = Covariance(s_max = 20).fit(X)
>>> cov.covariance_.shape
torch.Size([10, 10])
- clone() → Covariance
Obtain a clone of this class.
- Returns:
- cov : mvpy.estimators.Covariance
The cloned object.
- fit(X: ndarray | Tensor, *args: Any) → Covariance
Fit the covariance estimator.
- Parameters:
- X : np.ndarray | torch.Tensor
Data to fit the estimator on, of shape (n_trials, n_features[, n_timepoints]).
- *args : Any
Additional arguments to pass to the estimator.
- Returns:
- self : Covariance
Fitted covariance estimator.
- fit_transform(X: ndarray | Tensor, *args: Any) → ndarray | Tensor
Fit the covariance estimator and whiten the data.
- Parameters:
- X : np.ndarray | torch.Tensor
Data to fit the estimator on and transform, of shape (n_trials, n_features[, n_timepoints]).
- *args : Any
Additional arguments to pass to the estimator.
- Returns:
- W : np.ndarray | torch.Tensor
Whitened data of shape (n_trials, n_features[, n_timepoints]).
- to_numpy() → BaseEstimator
Create the numpy estimator. Note that this function cannot be used for conversion.
- Returns:
- cov : mvpy.estimators.Covariance
The numpy estimator.
- to_torch() → BaseEstimator
Create the torch estimator. Note that this function cannot be used for conversion.
- Returns:
- cov : mvpy.estimators.Covariance
The torch estimator.
- transform(X: ndarray | Tensor, *args: Any) → ndarray | Tensor
Whiten data using the fitted covariance estimator.
- Parameters:
- X : np.ndarray | torch.Tensor
Data to transform, of shape (n_trials, n_features[, n_timepoints]).
- *args : Any
Additional arguments to pass to the estimator.
- Returns:
- W : np.ndarray | torch.Tensor
Whitened data of shape (n_trials, n_features[, n_timepoints]).