Rates of convergence for Lasso with SURE-based penalty selection

Determine whether the Lasso estimator in the high-dimensional linear mean regression model Y = X^T β + e, with the penalty parameter chosen via Stein’s unbiased risk estimation (SURE), namely λ̂^S = argmin_{λ>0} { n^{-1} ∑_{i=1}^n (Y_i − X_i^T β̂(λ))^2 + (2σ^2/n) ||β̂(λ)||_0 − σ^2 }, satisfies the standard high-dimensional convergence rates ||β̂(λ̂^S) − β||_2 = O_P(√(s log p / n)) and ||β̂(λ̂^S) − β||_1 = O_P(√(s^2 log p / n)).

Background

The paper reviews several methods for selecting tuning parameters in high-dimensional Lasso estimation. For the SURE-based approach, the authors define λ̂^S as the minimizer of an unbiased estimator of the prediction risk under homoskedastic Gaussian errors. This yields a practical, data-driven choice of λ with appealing intuition via Stein’s lemma, but its theoretical properties in high-dimensional regimes are less developed than those of methods based on self-normalized moderate deviations or the bootstrap, which deliver the standard rates.
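
The unbiasedness claim can be made explicit with a short derivation. This is a sketch under assumptions that are not spelled out in the text above: the design is treated as fixed, $e \sim N(0, \sigma^2 I_n)$, $\mu = X\beta$ and $\widehat\mu(\lambda) = X\widehat\beta(\lambda)$ denote the true and fitted mean vectors, $Y$ denotes the response vector, and the columns of $X$ are in general position, so that the divergence of the Lasso fit equals the number of nonzero coefficients almost surely (the known degrees-of-freedom result for the Lasso). For each fixed λ, Stein’s lemma then gives

$\mathbb{E}\|\widehat\mu(\lambda) - \mu\|_2^2 = \mathbb{E}\|Y - \widehat\mu(\lambda)\|_2^2 - n\sigma^2 + 2\sigma^2\,\mathbb{E}\left[\sum_{i=1}^n \frac{\partial \widehat\mu_i(\lambda)}{\partial Y_i}\right],$

and replacing the divergence by $\|\widehat\beta(\lambda)\|_0$ and dividing by $n$ yields

$\frac{1}{n}\,\mathbb{E}\|\widehat\mu(\lambda) - \mu\|_2^2 = \mathbb{E}\left[\frac{1}{n}\sum_{i=1}^n \left(Y_i - X_i^T\widehat\beta(\lambda)\right)^2 + \frac{2\sigma^2}{n}\|\widehat\beta(\lambda)\|_0 - \sigma^2\right].$

The expression inside the expectation on the right-hand side is the criterion minimized by λ̂^S.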

Specifically, the canonical Lasso rates ||β̂ − β||_2 = O_P(√(s log p / n)) and ||β̂ − β||_1 = O_P(√(s^2 log p / n)) are known under appropriate λ choices derived from deviation bounds. The paper notes a gap in the literature regarding whether the SURE-selected λ achieves these same rates, despite some progress such as variance bounds for the model size ||β̂(λ)||_0.
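
To make the selection rule concrete, the following is a minimal sketch of SURE-based penalty selection on simulated data. Several things here are assumptions of the sketch rather than part of the source: σ^2 is treated as known, the minimization over λ > 0 is approximated by a finite grid, the Lasso objective is scaled as (1/(2n))||y − Xb||_2^2 + λ||b||_1, and the helper names (`lasso_cd`, `sure_criterion`, `select_lambda_sure`) are hypothetical. The Lasso is solved by a plain coordinate descent loop so that the example depends only on NumPy.

```python
import numpy as np


def soft_threshold(z, t):
    """Soft-thresholding: sign(z) * max(|z| - t, 0)."""
    return np.sign(z) * max(abs(z) - t, 0.0)


def lasso_cd(X, y, lam, n_iter=500, tol=1e-8):
    """Coordinate descent for (1/(2n)) * ||y - X b||_2^2 + lam * ||b||_1.

    The 1/(2n) scaling is an assumption of this sketch.
    """
    n, p = X.shape
    b = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n  # n^{-1} * ||X_j||_2^2 for each column
    resid = y.astype(float)            # residual y - X b, kept up to date
    for _ in range(n_iter):
        max_change = 0.0
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the partial residual (b_j removed).
            rho = X[:, j] @ resid / n + col_sq[j] * b[j]
            b_new = soft_threshold(rho, lam) / col_sq[j]
            delta = b_new - b[j]
            if delta != 0.0:
                resid -= X[:, j] * delta
                b[j] = b_new
                max_change = max(max_change, abs(delta))
        if max_change < tol:
            break
    return b


def sure_criterion(X, y, b, sigma2):
    """Mean squared residual + (2 sigma^2 / n) * ||b||_0 - sigma^2."""
    n = X.shape[0]
    return np.mean((y - X @ b) ** 2) + 2.0 * sigma2 / n * np.count_nonzero(b) - sigma2


def select_lambda_sure(X, y, sigma2, lambdas):
    """Return the grid value minimizing the SURE criterion and its Lasso fit."""
    betas = [lasso_cd(X, y, lam) for lam in lambdas]
    scores = [sure_criterion(X, y, b, sigma2) for b in betas]
    k = int(np.argmin(scores))
    return lambdas[k], betas[k]


# Simulated sparse design (all values below are illustrative choices).
rng = np.random.default_rng(0)
n, p, s, sigma = 100, 30, 3, 1.0
X = rng.standard_normal((n, p))
beta = np.zeros(p)
beta[:s] = 2.0
y = X @ beta + sigma * rng.standard_normal(n)

lam_max = np.max(np.abs(X.T @ y)) / n            # around here the fit is all zeros
lambdas = lam_max * np.geomspace(1.0, 0.01, 20)  # finite grid standing in for lambda > 0
lam_hat, beta_hat = select_lambda_sure(X, y, sigma ** 2, lambdas)

print("selected lambda:", lam_hat)
print("l2 error:", np.linalg.norm(beta_hat - beta))
print("l1 error:", np.abs(beta_hat - beta).sum())
```

A single simulated draw like this only illustrates how λ̂^S is computed. It says nothing about whether the resulting estimator attains the rates above, which is the open question.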

References

[BZ21] derived a bound for the variance of $\|\widehat\beta(\lambda)\|_0$ but, to the best of our knowledge, it is still not clear if the Lasso estimator $\widehat\beta = \widehat\beta(\widehat\lambda^S)$ satisfies the Lasso rate of convergence:

$\|\widehat \beta - \beta\|_{2} = O_P\left(\sqrt{\frac{s\log p}{n}}\right)\quad\text{and}\quad \|\widehat \beta - \beta\|_{1} = O_P\left(\sqrt{\frac{s^2\log p}{n}}\right).$

— Tuning parameter selection in econometrics (2405.03021 - Chetverikov, 5 May 2024) in Section 3.4 (Selection via Stein Method)
