A Modern Theory of Cross-Validation through the Lens of Stability
This paper by Jing Lei develops a comprehensive theory of cross-validation (CV) through the lens of algorithmic stability, targeting modern settings with high-dimensional data and complex learning algorithms. Traditional CV, routinely used to assess prediction error in statistical learning, is re-examined under a stability framework that yields a more robust understanding of its theoretical properties, including risk estimation, model selection, and consistency.
The core aim of this work is to reconcile the known difficulties of cross-validation with the demands of modern statistical learning techniques and data complexities. The notion of stability, which measures how much small perturbations of the data change a fitted model, is pivotal to this reconciliation: the stability perspective yields precise conditions under which cross-validation estimates are consistent and asymptotically normal.
Risk Estimation and Stability
Central to the paper is risk estimation via cross-validation. Lei revisits classical schemes such as leave-one-out and K-fold CV and extends their analysis with novel stability arguments. Crucially, the stability conditions are stated in terms of the algorithm's behavior under small data perturbations, ensuring that the evaluation reflects true predictive performance. The analysis shows that, under stability conditions involving perturbation invariance and suitably bounded loss functions, CV estimates not only converge to the true risk but also admit distributional approximations, notably central limit theorems.
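To make the basic object concrete, here is a minimal sketch of the K-fold CV risk estimate that the theory studies. This is an illustrative implementation, not code from the paper; the function name and interface are assumptions for the example.

```python
import numpy as np

def kfold_cv_risk(X, y, fit, loss, K=5, seed=0):
    """Estimate prediction risk by K-fold cross-validation.

    fit(X_train, y_train) -> a predictor callable on new X.
    loss(y_true, y_pred)  -> per-sample losses.
    Returns the average held-out loss over all n observations.
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), K)  # random fold assignment
    losses = []
    for idx in folds:
        mask = np.ones(n, dtype=bool)
        mask[idx] = False                      # train on the other K-1 folds
        predictor = fit(X[mask], y[mask])
        losses.append(loss(y[idx], predictor(X[idx])))  # evaluate held-out fold
    return np.concatenate(losses).mean()
```

The stability conditions in the paper concern how much `fit` can change when one training point is perturbed; when that change is small, the held-out losses behave almost like an i.i.d. sample, which is what drives the consistency and normality results summarized above.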
Model Selection in Complex Domains
The discussion of model selection addresses situations where standard parametric assumptions fail or the true model is unknown. Lei couples CV with stability arguments to establish procedures that consistently select the best model within a finite candidate set. The theoretical contributions refine existing consistency results, in particular by employing stochastic dominance criteria and by tracking how regularity assumptions degrade under sample splitting.
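The selection procedure analyzed in this setting can be sketched as picking the candidate with the smallest CV risk. The following is a minimal illustration under squared loss; the function name and the dictionary-of-fitters interface are assumptions for the example, not the paper's notation.

```python
import numpy as np

def select_by_cv(X, y, candidates, K=5, seed=0):
    """Select the candidate with smallest K-fold CV risk (squared loss).

    candidates: dict mapping a name to a fit function; each fit function
    takes (X_train, y_train) and returns a predictor callable on new X.
    Returns (best_name, dict of CV risks).
    """
    n = len(y)
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(n), K)  # same folds for every candidate
    risks = {}
    for name, fit in candidates.items():
        total = 0.0
        for idx in folds:
            mask = np.ones(n, dtype=bool)
            mask[idx] = False
            pred = fit(X[mask], y[mask])
            total += np.sum((y[idx] - pred(X[idx])) ** 2)
        risks[name] = total / n
    return min(risks, key=risks.get), risks
```

Using the same fold partition for every candidate makes the risk estimates directly comparable, which matters for the consistency arguments: the selection event is driven by differences of held-out losses on shared data.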
Central Limit Theorems for Cross-Validation
A key advance of the work is the formulation of central limit theorems, with both random and deterministic centering, for cross-validated risk estimates. These theorems give a rigorous account of the distributional behavior of CV risk estimates under different types of stability. The results extend to high-dimensional settings, where evaluating many models at once requires simultaneous approximations, handled via multivariate Gaussian approximations.
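One practical payoff of such a central limit theorem is a normal-approximation confidence interval for the CV risk, built from the per-observation held-out losses. The sketch below assumes those losses behave approximately like an i.i.d. sample, which is what the paper's stability conditions are designed to justify; the function name is illustrative.

```python
import numpy as np

def cv_risk_ci(losses, alpha_z=1.959963984540054):
    """Normal-approximation 95% CI for the CV risk.

    losses: per-observation held-out losses pooled across folds.
    alpha_z: standard normal quantile (default ~97.5%, giving a 95% CI).
    Validity rests on a CLT for the CV risk estimate, which in turn
    requires stability conditions of the kind studied in the paper.
    """
    losses = np.asarray(losses, dtype=float)
    n = losses.size
    mean = losses.mean()                      # CV risk estimate (random centering)
    se = losses.std(ddof=1) / np.sqrt(n)      # plug-in standard error
    return mean - alpha_z * se, mean + alpha_z * se
```

The distinction between random and deterministic centering matters for interpretation: an interval like this one targets the (data-dependent) risk of the fitted procedure, whereas deterministic centering targets a fixed population quantity.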
Applications and Methodologies
The paper develops several applications of the theory, such as model confidence sets that support robust model assessment and selection. It also proposes a strategy for quantifying prediction confidence in nonparametric and semiparametric settings, showcasing the versatility of the stability framework.
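The model confidence set idea can be sketched as follows: keep every candidate that is not beaten by some other candidate at a significant margin, judged by a standardized test on paired CV loss differences. This is an illustrative simplification in the spirit of the construction; the function name, the one-sided z-threshold, and the pairwise screening rule are assumptions for the example, not the paper's exact procedure.

```python
import numpy as np

def model_confidence_set(loss_matrix, names, z=1.645):
    """Screen candidates by pairwise tests on CV loss differences.

    loss_matrix: (n, m) array of per-observation held-out losses
                 for m candidate models on the same n observations.
    Model j is dropped if some model k beats it significantly, i.e.
    the standardized mean of (loss_j - loss_k) exceeds the threshold z.
    """
    n, m = loss_matrix.shape
    keep = []
    for j in range(m):
        dominated = False
        for k in range(m):
            if k == j:
                continue
            d = loss_matrix[:, j] - loss_matrix[:, k]   # paired differences
            t = d.mean() / (d.std(ddof=1) / np.sqrt(n) + 1e-12)
            if t > z:
                dominated = True                         # k significantly better
                break
        if not dominated:
            keep.append(names[j])
    return keep
```

Because near-tied models yield loss differences whose mean is statistically indistinguishable from zero, the returned set typically contains several models rather than forcing a single winner, which is the point of a confidence set over models.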
Practical Implications
The practical implications are substantial. Practitioners gain principled guidance for deploying cross-validation reliably when data are high-dimensional or models are algorithmically complex. Leveraging stability improves the precision of model evaluation and selection and mitigates biases inherent in traditional risk estimators.
Concluding Discussions
Overall, Jing Lei provides a thorough re-examination of cross-validation through the lens of stability, enriching the statistical learning literature with theoretically grounded guarantees. Future directions, particularly adaptive and privacy-preserving inference tools built on stability insights, suggest continued relevance for machine learning methodology and its applications across varied domains.
The paper contributes significantly to aligning statistical theory with computational practice, helping CV methodology remain relevant and adaptable as data analysis evolves.