
Realized Mallows’s Cₚ: Adaptive Model Selection

Updated 22 August 2025
  • The paper extends traditional Cₚ by incorporating empirical degrees of freedom, enabling adaptive tuning in complex estimation workflows.
  • It quantifies model complexity in penalized and fused lasso frameworks, balancing predictive fit and sparsity in high-dimensional settings.
  • Empirical simulations and real-data applications demonstrate improved mean squared error and block recovery compared to conventional methods.

Realized Mallows’s $C_p$ is an adaptively computed information criterion for model selection and tuning-parameter choice in high-dimensional statistical and time series models. It extends the classical $C_p$ by explicitly accounting for realized degrees of freedom in modern penalized estimation workflows, offering a data-driven measure of predictive error adjusted for complexity. The concept surfaces prominently in high-dimensional regression consistency analysis (Bai et al., 2018) and adaptive sparsity estimation in matrix factor models (Cen et al., 17 Aug 2025), establishing a principled framework for model selection beyond fixed-dimensional settings.

1. Theoretical Development

In classical regression, Mallows’s $C_p$ serves as an unbiased estimator of out-of-sample prediction error, combining model fit with a penalty term based on parameter count. The realized form adapts this framework to complex estimators whose effective degrees of freedom are not known a priori, such as penalized or fused lasso solutions. Specifically, in adaptive $\ell_1$-penalized regression with block fusion, the realized $C_p$ employs the “nullity” of the constraint matrix generated by active penalties to quantify model complexity:

$$\widehat{C}_p(\lambda) = \|\widetilde{\alpha} - \widehat{\alpha}\|_2^2 - T\hat{\sigma}^2 + 2\hat{\sigma}^2 \cdot \mathrm{nullity}(D_A)$$

where $\widetilde{\alpha}$ is an initial estimate, $\widehat{\alpha}$ the penalized solution, $T$ the sample size, $\hat{\sigma}^2$ an estimate of the error variance, and $\mathrm{nullity}(D_A)$ counts the unconstrained dimensions in the fitted model (Cen et al., 17 Aug 2025). In high-dimensional regression, realized $C_p$ generalizes via sample-based noncentrality measures ($\kappa_{\mathrm{om}}$) and penalty-adjusted fit statistics evaluated against asymptotic regimes determined by predictor-to-sample and response-to-sample ratios (Bai et al., 2018).
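The formula above can be evaluated directly once the fitted quantities are in hand. A minimal numerical sketch (function and variable names are illustrative, not from the paper; the active rows of the penalty matrix are assumed to be supplied by the fitting routine):

```python
import numpy as np

def realized_cp(alpha_init, alpha_hat, sigma2_hat, D_active, tol=1e-10):
    """Realized Mallows's C_p for a penalized fit at one tuning value.

    alpha_init : initial (e.g. unpenalized) estimate, shape (T,)
    alpha_hat  : penalized solution at this tuning parameter, shape (T,)
    sigma2_hat : estimate of the error variance
    D_active   : rows of the penalty matrix whose constraints are active
    """
    T = alpha_init.shape[0]
    fit = np.sum((alpha_init - alpha_hat) ** 2)
    # nullity(D_A) = number of columns minus rank: the dimensions left
    # unconstrained by the active penalties (the realized df).
    if D_active.size == 0:
        nullity = T
    else:
        nullity = D_active.shape[1] - np.linalg.matrix_rank(D_active, tol=tol)
    return fit - T * sigma2_hat + 2.0 * sigma2_hat * nullity
```

With no active constraints the realized degrees of freedom equal $T$, recovering the unpenalized fit; as fusion constraints activate, the nullity (and hence the complexity charge) shrinks.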

2. High-Dimensional Consistency Criteria

Analysis of realized $C_p$ in high-dimensional multivariate regression introduces explicit necessary and sufficient conditions for strong consistency. The criterion depends on the limiting ratios $\alpha = \lim_n (k/n)$ and $c = \lim_n (p/n)$, and on functions such as $\psi(\alpha, c)$:

  • Strong consistency holds if $\psi(\alpha, c) > 0$ and, for every underspecified candidate, the realized noncentrality $\kappa_{\mathrm{om}}$ exceeds a threshold proportional to model mis-specification and high-dimensional scaling.
  • If $\psi(\alpha, c) < 0$, the method is almost surely over-specified.
  • The threshold for $\kappa_{\mathrm{om}}$ involves realized, data-dependent values rather than fixed model counts, reflecting the inherent adaptivity of the criterion (Bai et al., 2018).

This framework reinterprets $C_p$ beyond fixed-dimensional asymptotics, revealing sensitivity to the joint growth of predictors, responses, and sample size. Realized $C_p$ is intimately linked to analogous conditions for AIC and BIC, but can reverse classical relationships: in high dimensions, strong consistency for BIC implies the same for AIC but not vice versa, and $C_p$ consistency is determined by the sign and magnitude of the realized $\psi$ and $\kappa$ values.

3. Practical Implementation in Adaptive Penalized Models

In matrix factor modeling for time series, realized Mallows’s $C_p$ becomes central in tuning-parameter selection:

  • The doubly adaptive fused lasso estimator introduces fusion penalties and adaptive weights determined from initial estimates.
  • The solution path is computed via generalized lasso algorithms, with computational complexity $O(T^3)$ per block (Cen et al., 17 Aug 2025).
  • For each candidate penalty parameter, the nullity of the active constraint matrix supplies the realized degree of freedom.
  • Minimization of $\widehat{C}_p(\lambda)$ over tuning parameters yields an estimator balancing fit and adaptively computed complexity, ensuring accurate detection of block-wise sparsity and optimal shrinkage.
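The selection loop above can be sketched end to end. For brevity this sketch replaces the generalized-lasso solver with a crude hard-threshold fusion of adjacent entries (a stand-in, not the paper's estimator); the realized degrees of freedom here reduce to the number of fused blocks:

```python
import numpy as np

def fuse_blocks(alpha0, lam):
    """Crude stand-in for a fused-lasso solve: adjacent entries whose
    difference is below lam are merged into block means. Returns the
    fitted vector and the block count (= nullity of D_A in this toy)."""
    blocks, cur = [], [0]
    for t in range(1, len(alpha0)):
        if abs(alpha0[t] - alpha0[t - 1]) < lam:
            cur.append(t)
        else:
            blocks.append(cur)
            cur = [t]
    blocks.append(cur)
    fitted = np.empty_like(alpha0, dtype=float)
    for b in blocks:
        fitted[b] = alpha0[b].mean()
    return fitted, len(blocks)

def select_lambda(alpha0, sigma2_hat, lam_grid):
    """Pick the lambda minimizing realized C_p over a grid."""
    T = len(alpha0)
    best = None
    for lam in lam_grid:
        fitted, nullity = fuse_blocks(alpha0, lam)
        cp = (np.sum((alpha0 - fitted) ** 2)
              - T * sigma2_hat + 2.0 * sigma2_hat * nullity)
        if best is None or cp < best[0]:
            best = (cp, lam, fitted)
    return best
```

On a signal with two well-separated levels, an intermediate $\lambda$ that fuses within (but not across) the true blocks attains the smallest realized $C_p$, which is the behavior the criterion is designed to reward.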

The realized criterion replaces heuristic penalty choices with an empirically justified, model-specific information measure, delivering improved estimation accuracy and block recovery in simulation and real data (e.g., NYC taxi volume responses to COVID-19 lockdown).

4. Simulation and Empirical Evidence

Simulation studies support the utility of realized $C_p$:

  • In high-dimensional regression, realized $C_p$ convergence depends acutely on the interplay between model parameters and sample size; consistent selection is guaranteed only when the analytic thresholds on $\psi(\alpha, c)$ and $\kappa_{\mathrm{om}}$ are satisfied.
  • In sparse matrix factor models, models selected via minimized realized $C_p$ display uniformly lower MSE and higher block recovery rates, approaching oracle sensitivity and improving specificity as sparsity increases (Cen et al., 17 Aug 2025).
  • Real-world applications capture substantive structural changes in covariate effects, as evidenced by estimated main effect matrices during data disruptions.

5. Comparison and Extensions

The realized $C_p$ aligns closely with generalized information criteria in penalized estimation:

  • Unlike AIC/BIC, realized $C_p$ incorporates empirical degrees of freedom derived from the constraint structure.
  • In high-dimensional settings, performance depends less on absolute dimension and more on relative proportions and sample-specific noncentrality.
  • Elimination of additive penalty constants (e.g., in leave-one-out or KOO variants) can simplify consistency requirements and improve convergence (Bai et al., 2018).
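The contrast in the first bullet can be made concrete. In a Gaussian linear model, AIC and BIC charge a fixed parameter count, while the realized $C_p$ penalty uses an empirical (possibly non-integer) degrees-of-freedom value; a minimal sketch using the standard formulas up to additive constants (the numbers passed in are illustrative):

```python
import numpy as np

def criteria(T, df_fixed, df_realized, sigma2_hat, rss):
    """Fixed-count penalties (AIC/BIC) versus the realized-C_p penalty,
    which charges an empirical, estimator-specific df. Gaussian model,
    criteria stated up to additive constants."""
    aic = T * np.log(rss / T) + 2.0 * df_fixed
    bic = T * np.log(rss / T) + np.log(T) * df_fixed
    cp = rss - T * sigma2_hat + 2.0 * sigma2_hat * df_realized
    return {"AIC": aic, "BIC": bic, "realized_Cp": cp}
```

Only the third criterion changes when shrinkage or fusion alters the effective dimension without changing the nominal parameter count.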

The principle extends to generalized lasso, adaptive blockwise estimation, and models with complex dependency or signal structures.

6. Limitations and Sensitivity

While realized $C_p$ offers adaptive and theoretically robust model selection, its performance may degrade in regimes with poorly estimated degrees of freedom (e.g., near-singular constraint matrices) or in scenarios with insufficient signal-to-noise contrast. High-dimensional settings amplify sensitivity to tuning-parameter selection and signal specification. Simulation evidence indicates dependence primarily on the first two error moments, mitigating concerns over non-Gaussian tail behavior.

7. Impact and Future Directions

The “realized” approach to Mallows’s $C_p$ provides a foundation for model selection that integrates data-dependent complexity into conventional information criteria. This perspective advances both variable selection for high-dimensional inference (Bai et al., 2018) and adaptive estimation in matrix factor time series models (Cen et al., 17 Aug 2025), promoting interpretability, consistency, and practical accuracy. Future directions may further incorporate nonparametric degrees of freedom, hierarchical penalty structures, and extensions to more general dependency networks and multiway arrays.