hierarchical_score
- mvpy.model_selection.hierarchical_score(model: Pipeline | BaseEstimator, X: ndarray | Tensor, y: ndarray | Tensor, groups: List | ndarray | Tensor | None = None, dim: int | None = None, cv: int | Any = 5, metric: Metric | Tuple[Metric] | None = None, return_hierarchical: bool = True, n_jobs: int | None = None, n_jobs_validator: int | None = None, verbose: int | bool = False, verbose_validator: int | bool = False) → ndarray | Tensor | Dict | Tuple[Hierarchical, ndarray, Tensor, Dict]
Implements a shorthand for hierarchical scoring over all feature permutations in \(X\) describing \(y\).
This function acts as a shorthand for Hierarchical: it automatically creates and fits all permutations of the predictors specified in \(X\) following a hierarchical procedure. It returns either only the output scores or, if return_hierarchical is True, both the fitted hierarchical object and the scores in a tuple. For more information, please consult Hierarchical.

Warning

This performs \(k\left(2^p - 1\right)\) individual model fits, where \(k\) is the total number of cross-validation steps and \(p\) is the number of unique groups of predictors. For large \(p\), this becomes exponentially more expensive to solve. If you are interested in the unique contribution of each feature rather than separate estimates for all combinations, consider using shapley_score instead.
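To make this cost concrete, here is a rough, standalone sketch, assuming the \(2^p - 1\) term counts all non-empty combinations of the \(p\) predictor groups; it illustrates the scaling only and does not reproduce the library's internal enumeration.

from itertools import combinations

p, k = 5, 5  # hypothetical: 5 predictor groups, 5 cross-validation steps

# enumerate every non-empty combination of the p groups
subsets = [
    subset
    for size in range(1, p + 1)
    for subset in combinations(range(p), size)
]

assert len(subsets) == 2 ** p - 1               # 31 candidate predictor sets
print(f'total model fits: {k * len(subsets)}')  # 155 individual fits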
- Parameters:
- model : sklearn.pipeline.Pipeline | sklearn.base.BaseEstimator
The model to fit and score. Can be either a pipeline or estimator object.
- X : np.ndarray | torch.Tensor
The input data of arbitrary shape.
- y : np.ndarray | torch.Tensor
The output data of arbitrary shape.
- groups : Optional[List | np.ndarray | torch.Tensor], default=None
Matrix describing all groups of interest of shape (n_groups, n_predictors); see the sketch after this parameter list. If None, this will default to the identity matrix (n_predictors, n_predictors).
- dim : Optional[int], default=None
The dimension in \(X\) that describes the predictors. If None, this will assume -1 for 2D data and -2 otherwise.
- cv : int | Any, default=5
The cross-validation procedure to follow. Either an object exposing a split() method, such as mvpy.crossvalidation.KFold, or an integer specifying the number of folds to use in KFold.
- metric : Optional[mvpy.metrics.Metric | Tuple[mvpy.metrics.Metric]], default=None
The metric to use for scoring. If None, this will default to the score() method exposed by model.
- return_hierarchical : bool, default=True
Should the underlying Hierarchical object be returned?
- n_jobs : Optional[int], default=None
How many jobs should be used to parallelise the hierarchical fitting procedure?
- n_jobs_validator : Optional[int], default=None
How many jobs should be used to parallelise the cross-validation procedure?
- verbose : int | bool, default=False
Should progress be reported verbosely?
- verbose_validator : int | bool, default=False
Should progress in individual Validator objects be reported verbosely?
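As referenced in the groups parameter above, a minimal sketch of what such a matrix can look like; the grouping shown here is purely hypothetical.

import torch

# each row marks (with 1) which of the 5 predictors belong to one group of interest
groups = torch.tensor([
    [1, 1, 1, 0, 0],   # group 0: the first three predictors
    [0, 0, 0, 1, 0],   # group 1: the fourth predictor alone
    [0, 0, 0, 0, 1]    # group 2: the fifth predictor alone
], dtype = torch.long)

# groups=None is equivalent to treating every predictor as its own group,
# i.e. the identity matrix of shape (n_predictors, n_predictors)
default_groups = torch.eye(5, dtype = torch.long)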
- Returns:
- hierarchical : Optional[mvpy.model_selection.Hierarchical]
If return_hierarchical is True, the underlying Hierarchical object.
- score : np.ndarray | torch.Tensor | Dict[str, np.ndarray] | Dict[str, torch.Tensor]
All hierarchical scores of shape (n_sets, n_cv[, ...]), or a dictionary containing one such array per Metric.
See also
mvpy.model_selection.shapley_score, mvpy.model_selection.Shapley
An alternative scoring method computing unique contributions of each feature rather than the full permutation.
mvpy.model_selection.Hierarchical
The underlying hierarchical scoring object.
mvpy.crossvalidation.Validator
The cross-validation objects used in Hierarchical.
Notes
Currently this does not automatically select the best model for you. Instead, it will return all scores, leaving further decisions up to you. This is because, for most applications, the scores of all permutations are actually of interest and may need to be reported.
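As a minimal sketch of such a decision, assuming a single metric was used so that score is an array of shape (n_sets, n_cv[, ...]), one might average over the cross-validation folds and any trailing dimensions and rank the predictor sets; the shape below is hypothetical, and the mapping from set index to predictor combination should be taken from the fitted Hierarchical object.

import torch

score = torch.randn(7, 5, 64, 400)  # hypothetical: 7 sets, 5 folds, 64 channels, 400 samples

# one summary value per predictor set
mean_per_set = score.mean(dim = tuple(range(1, score.ndim)))

best_set = int(mean_per_set.argmax())
print(f'best-scoring predictor set: {best_set}')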
Warning
If multiple values are supplied for metric, this function will output a dictionary of {Metric.name: score, ...} rather than a stacked array. This is to provide consistency across cases where metrics may or may not differ in their output shapes.

Warning

When specifying n_jobs here, be careful not to specify any number of jobs in the model or any n_jobs_validator. Otherwise, this will lead to a situation where individual jobs each try to initialise more low-level jobs, severely hurting performance.

Examples
>>> import torch
>>> from mvpy import metrics
>>> from mvpy.dataset import make_meeg_continuous
>>> from mvpy.preprocessing import Scaler
>>> from mvpy.estimators import TimeDelayed
>>> from mvpy.model_selection import hierarchical_score
>>> from sklearn.pipeline import make_pipeline
>>> # setup device for torch tensors
>>> device = torch.device('cpu')
>>> # create dataset
>>> fs = 200
>>> X, y = make_meeg_continuous(fs = fs, n_features = 5)
>>> # setup pipeline for estimation of multivariate temporal response functions
>>> trf = make_pipeline(
>>>     Scaler().to_torch(),
>>>     TimeDelayed(
>>>         -1.0, 0.0, fs,
>>>         alphas = torch.logspace(-5, 5, 10, device = device)
>>>     )
>>> )
>>> # setup groups of predictors
>>> groups = torch.tensor(
>>>     [
>>>         [1, 1, 1, 0, 0],
>>>         [1, 1, 1, 1, 0],
>>>         [1, 1, 1, 0, 1]
>>>     ],
>>>     dtype = torch.long,
>>>     device = device
>>> )
>>> # score predictors hierarchically
>>> hierarchical, score = hierarchical_score(
>>>     trf, X, y,
>>>     groups = groups,
>>>     metric = (metrics.r2, metrics.pearsonr),
>>>     verbose = True
>>> )
>>> score['r2'].shape
torch.Size([4, 5, 64, 400])
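Since two metrics were requested, score is a dictionary keyed by metric name; 'r2' is taken from the example above, while the 'pearsonr' key is an assumption based on the {Metric.name: score, ...} convention described in the warnings. A minimal sketch of summarising each metric per predictor set:

>>> for name, s in score.items():
>>>     # average over CV folds, channels and samples -> one value per set
>>>     print(name, s.mean(dim = tuple(range(1, s.ndim))).shape)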