
High-Dimensional Model Averaging via Cross-Validation (2506.08451v1)

Published 10 Jun 2025 in math.ST and stat.TH

Abstract: Model averaging is an important alternative to model selection with attractive prediction accuracy. However, its application to high-dimensional data remains under-explored. We propose a high-dimensional model averaging method via cross-validation under a general framework and systematically establish its theoretical properties. Each candidate model is fitted using a flexible loss function paired with a general regularizer, and the optimal weights are determined by minimizing a cross-validation criterion. When all candidate models are misspecified, we establish a non-asymptotic upper bound and a minimax lower bound for our weight estimator. The asymptotic optimality is also derived, showing that the proposed weight estimator achieves the lowest possible prediction risk asymptotically. When the correct models are included in the candidate model set, the proposed method asymptotically assigns all weights to the correct models, and the model averaging estimator achieves a nearly oracle convergence rate. Further, we introduce a post-averaging debiased estimator and establish Gaussian and bootstrap approximations to construct simultaneous confidence intervals. A fast greedy model averaging (FGMA) algorithm is proposed to solve the simplex-constrained optimization problem; it empirically exhibits a descent property and converges faster than the original greedy model averaging algorithm. Empirical results demonstrate the strong competitiveness of the proposed method in prediction and inference compared to other existing model averaging and selection methods.
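To make the core idea concrete, here is a minimal sketch of cross-validation-based model averaging: given cross-validated predictions from M candidate models, select nonnegative weights summing to one by minimizing the CV squared error over the simplex with a greedy Frank-Wolfe-style update. This is an illustrative toy, not the paper's FGMA algorithm; the function name, the squared-error criterion, and the step-size schedule are assumptions for illustration.

```python
import numpy as np

def cv_weight_selection(cv_preds, y, n_iter=500):
    """Greedy (Frank-Wolfe-style) minimization of the CV squared error
    over the probability simplex.

    cv_preds : (n, M) array of cross-validated predictions from M
               candidate models (row i predicted with a fold excluding i).
    y        : (n,) array of responses.

    Illustrative sketch only -- not the paper's FGMA algorithm.
    """
    n, M = cv_preds.shape
    # Start from the single best candidate model (a simplex vertex).
    losses = np.mean((cv_preds - y[:, None]) ** 2, axis=0)
    w = np.zeros(M)
    w[np.argmin(losses)] = 1.0
    for t in range(2, n_iter + 2):
        resid = cv_preds @ w - y              # current CV residual
        grad = 2.0 * cv_preds.T @ resid / n   # gradient of mean squared error
        j = np.argmin(grad)                   # best vertex of the simplex
        step = 2.0 / (t + 1)                  # classical Frank-Wolfe step size
        w = (1.0 - step) * w                  # move toward vertex e_j;
        w[j] += step                          # weights stay in the simplex
    return w
```

Because each iterate is a convex combination of simplex vertices, the weights remain nonnegative and sum to one by construction, which is why greedy schemes of this kind are a natural fit for the simplex constraint.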
