
Realized Mallows’s Cₚ: Adaptive Model Selection

Updated 22 August 2025
  • The paper extends traditional Cₚ by incorporating empirical degrees of freedom, enabling adaptive tuning in complex estimation workflows.
  • It quantifies model complexity in penalized and fused lasso frameworks, balancing predictive fit and sparsity in high-dimensional settings.
  • Empirical simulations and real-data applications demonstrate improved mean squared error and block recovery compared to conventional methods.

Realized Mallows’s $C_p$ is an adaptively computed information criterion for model selection and tuning-parameter choice in high-dimensional statistical and time series models. It extends the classical $C_p$ by explicitly accounting for realized degrees of freedom in modern penalized estimation workflows, offering a data-driven measure of predictive error adjusted for complexity. The concept surfaces prominently in high-dimensional regression consistency analysis (Bai et al., 2018) and adaptive sparsity estimation in matrix factor models (Cen et al., 17 Aug 2025), establishing a principled framework for model selection beyond fixed-dimensional settings.

1. Theoretical Development

In classical regression, Mallows’s $C_p$ serves as an unbiased estimator of out-of-sample prediction error, combining model fit with a penalty term based on parameter count. The realized form adapts this framework to complex estimators whose effective degrees of freedom are not known a priori, such as penalized or fused lasso solutions. Specifically, in adaptive $\ell_1$-penalized regression with block fusion, the realized $C_p$ employs the “nullity” of the constraint matrix generated by active penalties to quantify model complexity:

$$\widehat{C}_p(\lambda) = \|\widetilde{\alpha} - \widehat{\alpha}\|_2^2 - T\hat{\sigma}^2 + 2\hat{\sigma}^2 \cdot \mathrm{nullity}(D_A)$$

where $\widetilde{\alpha}$ is an initial estimate, $\widehat{\alpha}$ the penalized solution, $T$ the sample size, $\hat{\sigma}^2$ an estimate of the error variance, and $\mathrm{nullity}(D_A)$ counts the unconstrained dimensions in the fitted model (Cen et al., 17 Aug 2025). In high-dimensional regression, realized $C_p$ generalizes via sample-based noncentrality measures ($\kappa_{\mathrm{om}}$) and penalty-adjusted fit statistics evaluated against asymptotic regimes determined by predictor-to-sample and response-to-sample ratios (Bai et al., 2018).
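The formula above can be evaluated directly once the fitted quantities are in hand. A minimal numerical sketch (function and variable names are illustrative, not from the paper; the active rows of the penalty matrix are assumed to be supplied by the fitting routine):

```python
import numpy as np

def realized_cp(alpha_init, alpha_hat, sigma2_hat, D_active, tol=1e-10):
    """Realized Mallows's C_p for a penalized fit at one tuning value.

    alpha_init : initial (e.g. unpenalized) estimate, shape (T,)
    alpha_hat  : penalized solution at this tuning parameter, shape (T,)
    sigma2_hat : estimate of the error variance
    D_active   : rows of the penalty matrix whose constraints are active
    """
    T = alpha_init.shape[0]
    fit = np.sum((alpha_init - alpha_hat) ** 2)
    # nullity(D_A) = number of columns minus rank: the dimensions left
    # unconstrained by the active penalties (the realized df).
    if D_active.size == 0:
        nullity = T
    else:
        nullity = D_active.shape[1] - np.linalg.matrix_rank(D_active, tol=tol)
    return fit - T * sigma2_hat + 2.0 * sigma2_hat * nullity
```

With no active constraints the realized degrees of freedom equal $T$, recovering the unpenalized fit; as fusion constraints activate, the nullity (and hence the complexity charge) shrinks.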

2. High-Dimensional Consistency Criteria

Analysis of realized $C_p$ in high-dimensional multivariate regression introduces explicit necessary and sufficient conditions for strong consistency. The criterion depends on the limiting ratios $\alpha = \lim_n (k/n)$ and $c = \lim_n (p/n)$, and on functions such as $\psi(\alpha, c)$:

  • Strong consistency holds if $\psi(\alpha, c) > 0$ and, for every underspecified candidate, the realized noncentrality $\kappa_{\mathrm{om}}$ exceeds a threshold proportional to model mis-specification and high-dimensional scaling.
  • If $\psi(\alpha, c) < 0$, the method is almost surely over-specified.
  • The threshold for $\kappa_{\mathrm{om}}$ involves realized, data-dependent values rather than fixed model counts, reflecting the inherent adaptivity of the criterion (Bai et al., 2018).

This framework reinterprets $C_p$ beyond fixed-dimensional asymptotics, revealing sensitivity to the joint growth of predictors, responses, and sample size. Realized $C_p$ is intimately linked to analogous conditions for AIC and BIC, but can reverse classical relationships: in high dimensions, strong consistency for BIC implies the same for AIC but not vice versa, and $C_p$ consistency is determined by the sign and magnitude of the realized $\psi$ and $\kappa$ values.

3. Practical Implementation in Adaptive Penalized Models

In matrix factor modeling for time series, realized Mallows’s $C_p$ becomes central in tuning-parameter selection:

  • The doubly adaptive fused lasso estimator introduces fusion penalties and adaptive weights determined from initial estimates.
  • The solution path is computed via generalized lasso algorithms, with computational complexity $O(T^3)$ per block (Cen et al., 17 Aug 2025).
  • For each candidate penalty parameter, the nullity of the active constraint matrix supplies the realized degree of freedom.
  • Minimization of $\widehat{C}_p(\lambda)$ over tuning parameters yields an estimator balancing fit and adaptively computed complexity, ensuring accurate detection of block-wise sparsity and optimal shrinkage.
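The selection loop above can be sketched end to end. For brevity this sketch replaces the generalized-lasso solver with a crude hard-threshold fusion of adjacent entries (a stand-in, not the paper's estimator); the realized degrees of freedom here reduce to the number of fused blocks:

```python
import numpy as np

def fuse_blocks(alpha0, lam):
    """Crude stand-in for a fused-lasso solve: adjacent entries whose
    difference is below lam are merged into block means. Returns the
    fitted vector and the block count (= nullity of D_A in this toy)."""
    blocks, cur = [], [0]
    for t in range(1, len(alpha0)):
        if abs(alpha0[t] - alpha0[t - 1]) < lam:
            cur.append(t)
        else:
            blocks.append(cur)
            cur = [t]
    blocks.append(cur)
    fitted = np.empty_like(alpha0, dtype=float)
    for b in blocks:
        fitted[b] = alpha0[b].mean()
    return fitted, len(blocks)

def select_lambda(alpha0, sigma2_hat, lam_grid):
    """Pick the lambda minimizing realized C_p over a grid."""
    T = len(alpha0)
    best = None
    for lam in lam_grid:
        fitted, nullity = fuse_blocks(alpha0, lam)
        cp = (np.sum((alpha0 - fitted) ** 2)
              - T * sigma2_hat + 2.0 * sigma2_hat * nullity)
        if best is None or cp < best[0]:
            best = (cp, lam, fitted)
    return best
```

On a signal with two well-separated levels, an intermediate $\lambda$ that fuses within (but not across) the true blocks attains the smallest realized $C_p$, which is the behavior the criterion is designed to reward.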

The realized criterion replaces heuristic penalty choices with an empirically justified, model-specific information measure, delivering improved estimation accuracy and block recovery in simulation and real data (e.g., NYC taxi volume responses to COVID-19 lockdown).

4. Simulation and Empirical Evidence

Simulation studies support the utility of realized $C_p$:

  • In high-dimensional regression, realized $C_p$ convergence depends acutely on the interplay between model parameters and sample size; consistent selection is guaranteed only when the analytic thresholds on $\psi(\alpha, c)$ and $\kappa_{\mathrm{om}}$ are satisfied.
  • In sparse matrix factor models, models selected via minimized realized $C_p$ display uniformly lower MSE and higher block recovery rates, approaching oracle sensitivity and improving specificity as sparsity increases (Cen et al., 17 Aug 2025).
  • Real-world applications capture substantive structural changes in covariate effects, as evidenced by estimated main effect matrices during data disruptions.

5. Comparison and Extensions

The realized $C_p$ aligns closely with generalized information criteria in penalized estimation:

  • Unlike AIC/BIC, realized $C_p$ incorporates empirical degrees of freedom derived from the constraint structure.
  • In high-dimensional settings, performance depends less on absolute dimension and more on relative proportions and sample-specific noncentrality.
  • Elimination of additive penalty constants (e.g., in leave-one-out or KOO variants) can simplify consistency requirements and improve convergence (Bai et al., 2018).
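The contrast in the first bullet can be made concrete. In a Gaussian linear model, AIC and BIC charge a fixed parameter count, while the realized $C_p$ penalty uses an empirical (possibly non-integer) degrees-of-freedom value; a minimal sketch using the standard formulas up to additive constants (the numbers passed in are illustrative):

```python
import numpy as np

def criteria(T, df_fixed, df_realized, sigma2_hat, rss):
    """Fixed-count penalties (AIC/BIC) versus the realized-C_p penalty,
    which charges an empirical, estimator-specific df. Gaussian model,
    criteria stated up to additive constants."""
    aic = T * np.log(rss / T) + 2.0 * df_fixed
    bic = T * np.log(rss / T) + np.log(T) * df_fixed
    cp = rss - T * sigma2_hat + 2.0 * sigma2_hat * df_realized
    return {"AIC": aic, "BIC": bic, "realized_Cp": cp}
```

Only the third criterion changes when shrinkage or fusion alters the effective dimension without changing the nominal parameter count.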

The principle extends to generalized lasso, adaptive blockwise estimation, and models with complex dependency or signal structures.

6. Limitations and Sensitivity

While realized $C_p$ offers adaptive and theoretically robust model selection, its performance may degrade in regimes with poorly estimated degrees of freedom (e.g., near-singular constraint matrices) or in scenarios with insufficient signal-to-noise contrast. High-dimensional settings amplify sensitivity to tuning-parameter selection and signal specification. Simulation evidence indicates dependence primarily on the first two error moments, mitigating concerns over non-Gaussian tail behavior.

7. Impact and Future Directions

The “realized” approach to Mallows’s $C_p$ provides a foundation for model selection that integrates data-dependent complexity into conventional information criteria. This perspective advances both variable selection for high-dimensional inference (Bai et al., 2018) and adaptive estimation in matrix factor time series models (Cen et al., 17 Aug 2025), promoting interpretability, consistency, and practical accuracy. Future directions may further incorporate nonparametric degrees of freedom, hierarchical penalty structures, and extensions to more general dependency networks and multiway arrays.