
Model Set Selection (MSS)

Updated 18 November 2025
  • Model Set Selection (MSS) is a framework that identifies collections of near-optimal models to capture uncertainty and acknowledge multiple plausible explanations.
  • It employs methodologies such as penalized regression, likelihood-ratio testing, and classifier-based approaches to construct confidence sets for model selection.
  • MSS is applied in diverse fields like materials science, genomics, and decision analysis to improve interpretability and ensure robust decision-making.

Model Set Selection (MSS) refers to the systematic identification of a collection of models or algorithms, rather than a single best model, all of which are considered plausible or near-optimal according to a relevant criterion. MSS arises in statistical inference, machine learning, and decision analysis whenever the data and modeling context support multiple explanations, predictive patterns, or solution procedures with comparable empirical performance. The primary motivation behind MSS is to acknowledge and quantify model uncertainty, enable scientific exploration of alternative hypotheses, and provide rigorous guarantees on the set of models reported.

1. Conceptual Framework and Motivation

Traditional model selection methods output a single model $\hat{h}$ or a point estimate of structure (e.g., model order, variable subset), based on criteria such as maximum likelihood, minimum risk, AIC/BIC, or cross-validation (Cecil et al., 14 Nov 2025). However, in the presence of high model multiplicity (sometimes called the "Rashomon effect"), distinct models may yield near-identical empirical performance, particularly under covariate correlation, noise, or low signal-to-noise ratios (Cecil et al., 14 Nov 2025). MSS generalizes the selection framework: it explicitly constructs a set of models, each well supported by the data, and quantifies the degree of model selection ambiguity (Wendelberger et al., 2020, Cecil et al., 14 Nov 2025).

Formally, let $\mathcal{H}$ be a model class, $\ell(h, z)$ a loss on an example $z \in \mathcal{Z}$, and $f_1, \dots, f_d$ fitted candidate models. For a given tolerance $\epsilon \geq 0$, the $\epsilon$-Rashomon set is

\Theta_{\epsilon, n}^{\mathrm{MSS}} = \left\{ r \in [d] : R_n(f_r) \leq \min_{j \in [d]} R_n(f_j) + \epsilon \right\},

where $R_n(f)$ denotes an empirical risk or multi-fold test error (Cecil et al., 14 Nov 2025). The goal is to estimate, or provide a confidence set for, $\Theta_{\epsilon, n}^{\mathrm{MSS}}$.
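As a minimal illustration, the empirical set can be computed directly from a vector of estimated risks; the risk values below are placeholders, not results from any cited study.

```python
import numpy as np

def rashomon_set(risks, epsilon):
    """Indices of models whose empirical risk is within epsilon of the best."""
    risks = np.asarray(risks, dtype=float)
    best = risks.min()
    return np.flatnonzero(risks <= best + epsilon)

# Hypothetical cross-validated risks for d = 5 candidate models.
risks = [0.212, 0.215, 0.214, 0.290, 0.305]
print(rashomon_set(risks, epsilon=0.005))  # -> [0 1 2]: three near-optimal models
```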

2. Methodological Implementations

A wide spectrum of MSS methodologies exist, adapted to context and model structure:

  • Multi-Model Penalized Regression (MMPR): For linear regression, MMPR solves an optimization that explicitly finds $k$ coefficient vectors $\{\beta^{(1)}, \ldots, \beta^{(k)}\}$, minimizing predictive loss plus sparsity penalties and explicit diversity penalties to enforce dissimilarity among models (Wendelberger et al., 2020).
  • Model Selection Confidence Sets (MSCS): In both order selection and variable selection, MSCS uses likelihood-ratio testing to construct a set of models or orders that are statistically indistinguishable from the reference model at a specified confidence level (Casa et al., 24 Mar 2025, Zheng et al., 2017). For example, in Gaussian mixtures, MSCS identifies all plausible model orders $k$ such that the penalized log-likelihood ratio relative to the reference order does not exceed a critical value (Casa et al., 24 Mar 2025).
  • Feature-Based Model Set Selection: In time series exponential smoothing (ETS), classifiers are trained on simulated labeled data to predict the optimal model components for each series, forming a set of plausible component triplets at low computational overhead (Qi et al., 2022); a toy sketch of this idea follows the list.
  • Rule-Based MSS in Decision Analysis: For the Multiple Criteria Decision Analysis (MCDA) domain, MSS matches an input vector of activated problem features to a database of MCDA methods, returning the set of methods fully compatible with specified requirements, or the minimally violating alternatives if no perfect fit exists (Cinelli et al., 2021).
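To make the classifier-based approach concrete, the following toy sketch trains an off-the-shelf classifier on simulated series features and keeps every component label whose predicted probability is near the top class, yielding a model set rather than a single pick. The features, labels, and tolerance here are hypothetical simplifications, not the actual fETSmcs pipeline of Qi et al. (2022).

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Hypothetical training set: per-series features (e.g., trend strength,
# seasonal strength, lag-1 autocorrelation) with known generating components.
X_train = rng.random((500, 3))
y_train = rng.choice(["ANN", "AAN", "AAA"], size=500)  # toy ETS component labels

clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)

# For a new series, keep every component label whose predicted probability
# is within a tolerance of the top class: a model *set*, not a single pick.
proba = clf.predict_proba(rng.random((1, 3)))[0]
keep = proba >= proba.max() - 0.1
print([c for c, k in zip(clf.classes_, keep) if k])
```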

These strategies share a common principle: score, rank, and cull candidate models by fit, compatibility, plausibility, or confidence, and construct the reported set through an explicitly justified mechanism.

3. Formal Criteria, Algorithms, and Theoretical Guarantees

The construction of a model set typically adheres to rigorous statistical or algorithmic principles:

  • Penalized Optimization: For MMPR, the joint objective is

\min_{\beta^{(1)}, \dots, \beta^{(k)}} \sum_{m=1}^k \|y - X\beta^{(m)}\|_2^2 + \lambda_1 \sum_{m=1}^k P_{\mathrm{sparse}}(\beta^{(m)}) + \lambda_2 \sum_{1 \leq m < m' \leq k} D(\beta^{(m)}, \beta^{(m')}),

where $P_{\mathrm{sparse}}$ (e.g., a LASSO or ridge penalty) encourages parsimony and $D(\cdot, \cdot)$ enforces diversity (Wendelberger et al., 2020); a sketch evaluating this objective appears after the list.

  • Likelihood-Ratio Testing and Coverage: In MSCS, a model $\gamma$ enters the set if $2\{\ell_n(\hat{\theta}_{\gamma_f}) - \ell_n(\hat{\theta}_\gamma)\} \leq q(\alpha; d)$, guaranteeing asymptotic coverage of the true model at level $1-\alpha$ under regularity conditions (Zheng et al., 2017, Casa et al., 24 Mar 2025); see the membership-check sketch after the list.
  • Data-Splitting MSS Tests: Hold-out or cross-validation based tests (e.g., studentized CLT or universal inference) yield confidence sets by assessing loss differences across random splits, with set membership decided by hypothesis tests that control the Type I error rate (Cecil et al., 14 Nov 2025); a simplified test sketch also follows the list.
  • Gap Analysis and Prioritization: Rule-based selection in decision support systems iteratively narrows the model set via feature mismatch counts, compatibility scoring, and “most selective question” guidance until a suitably small and well-justified model subset remains (Cinelli et al., 2021).
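As a concrete reading of the objective above, this sketch evaluates an MMPR-style criterion for a list of candidate coefficient vectors. The L1 choice of $P_{\mathrm{sparse}}$ and the coordinate-wise overlap form of $D$ are illustrative assumptions, not the exact penalties of Wendelberger et al. (2020), and no optimizer is included.

```python
import numpy as np
from itertools import combinations

def mmpr_objective(betas, X, y, lam1, lam2):
    """Evaluate an MMPR-style objective for a list of k coefficient vectors.

    P_sparse is taken as the L1 norm and D as a coordinate-wise overlap
    penalty sum_j |b_j * b'_j| -- both illustrative stand-ins.
    """
    fit = sum(np.sum((y - X @ b) ** 2) for b in betas)
    sparsity = lam1 * sum(np.sum(np.abs(b)) for b in betas)
    diversity = lam2 * sum(np.sum(np.abs(b1 * b2)) for b1, b2 in combinations(betas, 2))
    return fit + sparsity + diversity

# Toy data: two highly correlated predictors explaining one response.
rng = np.random.default_rng(1)
x1 = rng.normal(size=50)
X = np.column_stack([x1, x1 + 0.05 * rng.normal(size=50)])
y = X[:, 0] + 0.1 * rng.normal(size=50)

# Two candidate models placing the signal on different correlated columns:
# disjoint supports, so the diversity penalty contributes nothing.
betas = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
print(mmpr_objective(betas, X, y, lam1=0.1, lam2=1.0))
```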
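The MSCS membership rule can likewise be written in a few lines; here a chi-square quantile stands in for the critical value $q(\alpha; d)$, and the fitted log-likelihoods and degrees of freedom are hypothetical.

```python
from scipy.stats import chi2

def in_mscs(loglik_candidate, loglik_reference, df, alpha=0.05):
    """Candidate enters the confidence set if twice the log-likelihood gap
    to the reference model does not exceed the critical value.
    (A chi-square quantile is an illustrative choice for q(alpha; d).)"""
    stat = 2.0 * (loglik_reference - loglik_candidate)
    return stat <= chi2.ppf(1.0 - alpha, df)

# Hypothetical fitted log-likelihoods for mixture orders k = 1..4;
# the reference is the maximizer.
logliks = {1: -512.4, 2: -496.5, 3: -494.2, 4: -494.0}
ref = max(logliks.values())
mscs = [k for k, ll in logliks.items() if in_mscs(ll, ref, df=2)]
print(mscs)  # -> [2, 3, 4]: the set of plausible orders
```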
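Finally, a simplified data-splitting check in the spirit of the studentized CLT test: per-fold loss differences between a candidate and the empirically best model are tested against zero, and the candidate stays in the set unless its excess loss is significantly positive. The one-sided z-test and the fold losses below are simplifying assumptions.

```python
import numpy as np
from scipy.stats import norm

def candidate_survives(loss_diffs, alpha=0.05):
    """One-sided z-test on per-fold loss differences (candidate minus best).
    The candidate stays in the set unless its excess loss is significantly
    positive; a simplified stand-in for the studentized CLT test."""
    d = np.asarray(loss_diffs, dtype=float)
    t = np.sqrt(len(d)) * d.mean() / d.std(ddof=1)
    return t <= norm.ppf(1.0 - alpha)

# Hypothetical per-fold excess losses (candidate loss minus best-model loss).
near_optimal = np.array([0.01, -0.02, 0.00, 0.01, -0.01, 0.02, -0.01, 0.00, 0.01, -0.01])
clearly_worse = near_optimal + 0.05  # a consistent 0.05 excess loss per fold

print(candidate_survives(near_optimal))   # True: stays in the model set
print(candidate_survives(clearly_worse))  # False: rejected from the set
```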

A summary table describing major MSS paradigms in the literature:

| Domain | MSS Methodology | Key Principle/Guarantee |
| --- | --- | --- |
| Linear regression | MMPR ($k$-model penalized regression) | Explicit diversity penalty, coordinate descent |
| Gaussian mixtures | MSCS via penalized LRT | Asymptotic coverage of the true order |
| Model class | Data-split MSS/MCS testing | Pointwise coverage of optimal classes |
| Decision analysis | Rule-based feature matching (MCDA-MSS) | Logical filtering, minimal mismatch, scoring |
| Time series (ETS) | Classifier-based component selection | Simulated data, component-level prediction |

4. Applications Across Scientific and Decision Domains

MSS frameworks have found wide adoption in domains characterized by high model uncertainty or multiple mechanistic hypotheses:

  • Materials Science: MMPR exposes distinct sets of alloying elements explaining stacking fault energy in steel, allocating correlated predictors into distinct models for scientific interpretability (Wendelberger et al., 2020).
  • Genomics/Ecology: MSS identifies alternative sparse supports in regression models, facilitating exploration of competing sets of genes or ecological factors.
  • Decision Analysis: The MCDA-MSS system operationalizes MSS for the selection of MCDA methods, enabling transparent, feature-driven recommendations and preventing methodological mistakes through diagnostic guidance (Cinelli et al., 2021).
  • Mixture Models: The MSCS approach provides actionable sets of plausible mixture orders, which is especially important when traditional criteria such as BIC yield ambiguous solutions due to finite sample sizes and overlapping component distributions (Casa et al., 24 Mar 2025).
  • Forecasting: The feature-based fETSmcs system leverages MSS to scale time series model selection to tens of thousands of series efficiently while maintaining or improving predictive performance (Qi et al., 2022).

5. Key Theoretical Insights and Practical Recommendations

The foundational properties of MSS methods are:

  • Uncertainty Quantification: MSS methods articulate model selection uncertainty by providing confidence sets, plausibility sets, or explicit control over the degree of difference among reported models.
  • Coverage Guarantees: Approaches based on likelihood-ratio tests and studentized data splits provide pointwise or asymptotic coverage of the true model or class at a user-specified confidence level, under conditions such as model identifiability and suitable stability or sparsity (Zheng et al., 2017, Casa et al., 24 Mar 2025, Cecil et al., 14 Nov 2025).
  • Adaptivity: The granularity of the resulting model set adapts to available information; in informative settings, MSS often collapses to a singleton, while under ambiguity or high noise, the set justifiably expands (Casa et al., 24 Mar 2025, Zheng et al., 2017).
  • Interpretation: MSS, especially in the MMPR framework, systematically allocates highly correlated predictors into different models, and the choice of penalty parameters can control the degree of model support overlap (e.g., through cosine-similarity thresholds; see the snippet after this list) (Wendelberger et al., 2020).
  • Computational Feasibility: Coordinate-descent, adaptive stochastic search, and classifier-based approximations make MSS procedures feasible for high-dimensional and large-scale problems (Wendelberger et al., 2020, Zheng et al., 2017, Qi et al., 2022).
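For instance, the support overlap mentioned above can be screened with a cosine-similarity threshold on the models' nonzero patterns. The sketch below uses an arbitrary example; any particular threshold would be an analyst's choice.

```python
import numpy as np

def support_cosine(b1, b2):
    """Cosine similarity between the supports (nonzero patterns) of two models."""
    s1 = (np.asarray(b1) != 0).astype(float)
    s2 = (np.asarray(b2) != 0).astype(float)
    denom = np.linalg.norm(s1) * np.linalg.norm(s2)
    return float(s1 @ s2 / denom) if denom > 0 else 0.0

b1 = np.array([1.2, 0.0, -0.7, 0.0])
b2 = np.array([0.0, 0.9, -0.3, 0.0])
print(support_cosine(b1, b2))  # 0.5: the models share one of two active predictors
```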

6. Extensions and Connections: Model Class Selection

Recent developments extend MSS to model class selection (MCS), where the task is to provide a confidence set of classes (e.g., linear models, tree-based ensembles) each containing at least one risk-minimizing model (Cecil et al., 14 Nov 2025). Data-splitting MSS tests directly support MCS, allowing valid comparison between interpretable and complex model classes on real data. MCS extends MSS's philosophy of uncertainty quantification to the level of model families, bridging the gap between fine-grained algorithm selection and high-level model interpretability requirements.

7. Validation, Case Studies, and Empirical Performance

Empirical validations of MSS methods across different domains demonstrate that the reported model sets reliably cover the true model with prescribed confidence and offer actionable sets of plausible alternatives:

  • MMPR-MSS on steel composition data: MSS solutions yielded models that allocate distinct members of highly correlated variable groups to explain a single response, with each model exhibiting competitive mean squared prediction error (Wendelberger et al., 2020).
  • MSCS studies in regression and mixture models: Coverage probabilities match or exceed nominal levels, and the cardinality of the model set reflects underlying signal-to-noise levels (Casa et al., 24 Mar 2025, Zheng et al., 2017).
  • MCDA-MSS real-world tests: In six of nine published MCDA case studies, the system's recommended method subset coincided with analyst choices when a full feature match existed; gap analysis helped identify and correct operator or methodological errors (Cinelli et al., 2021).
  • ETS component selection: Feature-based MSS approaches deliver computational efficiency and small, interpretable model sets with predictive accuracy matching or exceeding traditional AICc-based selection (Qi et al., 2022).

In sum, Model Set Selection is a unifying paradigm encompassing algorithmic, statistical, and decision-analytic techniques for quantifying model selection uncertainty, supporting scientific discovery, safe deployment, and robust decision-making across diverse domains (Wendelberger et al., 2020, Zheng et al., 2017, Casa et al., 24 Mar 2025, Cinelli et al., 2021, Qi et al., 2022, Cecil et al., 14 Nov 2025).
