
Evaluation and Optimization of Leave-one-out Cross-validation for the Lasso (2508.14368v1)

Published 20 Aug 2025 in stat.ML, cs.LG, and stat.CO

Abstract: I develop an algorithm to produce the piecewise quadratic that computes leave-one-out cross-validation for the lasso as a function of its hyperparameter. The algorithm can be used to find exact hyperparameters that optimize leave-one-out cross-validation either globally or locally, and its practicality is demonstrated on real-world data sets.

Summary

  • The paper introduces LO-LARS, an algorithm that exactly computes leave-one-out cross-validation error as a function of the lasso shrinkage parameter.
  • It leverages the matrix inversion lemma within a modified LARS framework to sharply reduce computation time, particularly for high-dimensional data.
  • Empirical tests on real-world datasets show that LO-LARS reliably finds the global optimum of the low-bias LO criterion, with early stopping substantially reducing runtime.

Introduction

The paper "Evaluation and Optimization of Leave-one-out Cross-validation for the Lasso" addresses the optimization challenges faced when selecting the hyperparameter for the least absolute shrinkage and selection operator (lasso). The lasso is a linear regression method that incorporates both feature selection and regularization by constraining the l1l_1 norm of the regression coefficients. The optimal selection of its shrinkage parameter tt is crucial for balancing bias and variance, and ultimately for enhancing the prediction accuracy over ordinary least squares (OLS) regression. The importance of the leave-one-out cross-validation (LO) method stands out due to its low bias compared to other v-fold cross-validation methodologies, despite its reputation for potentially higher variance.

The paper introduces an algorithm named leave-one-out least-angle regression (LO-LARS), which efficiently computes the piecewise quadratic function necessary for LO computation. This allows for accurate and practical estimation of the hyperparameter that optimizes the leave-one-out cross-validation, both globally and locally.
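
Concretely, the function being assembled is the leave-one-out error curve (notation ours):

$$\mathrm{LO}(t) = \frac{1}{n} \sum_{i=1}^{n} \bigl( y_i - x_i^\top \hat{\beta}^{(-i)}(t) \bigr)^2,$$

where $\hat{\beta}^{(-i)}(t)$ denotes the lasso fit with observation $i$ held out. Since each held-out solution path is piecewise linear in $t$, every squared residual is piecewise quadratic in $t$, and so is their average; once this piecewise quadratic is known explicitly, exact global or local minimization over $t$ reduces to inspecting finitely many breakpoints and per-segment minima.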

Algorithm Description

The LO-LARS algorithm integrates advances made in the approximation of LO for high-dimensional data with the least angle regression (LARS) algorithm. The LARS procedure is known for efficiently computing lasso solution paths. Unlike traditional LARS, which constructs a single solution path as a function of $t$ in incremental steps, LO-LARS builds these paths for every leave-one-out subset of the data, thus producing leave-one-out errors with high precision.
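
For intuition, the same curve can be approximated by brute force: run a separate LARS-lasso path on each held-out subset and evaluate all paths on a shared grid of $t$. The sketch below does this with scikit-learn's `lars_path`; it is a naive reference for the quantity LO-LARS computes, not the paper's optimized algorithm, and it assumes the $l_1$ norm is nondecreasing along each path (true in typical cases):

```python
import numpy as np
from sklearn.linear_model import lars_path

def loo_error_curve(X, y, n_grid=100):
    """Naive leave-one-out error curve for the lasso on a shared grid of
    the l1-norm budget t. Brute force, for illustration only.
    Assumes X and y are centered (lars_path fits no intercept)."""
    n, p = X.shape
    # Fit the full-data path once to pick a common range for t.
    _, _, coefs_full = lars_path(X, y, method="lasso")
    t_max = np.abs(coefs_full[:, -1]).sum()
    t_grid = np.linspace(0.0, t_max, n_grid)

    sq_err = np.zeros(n_grid)
    for i in range(n):
        mask = np.arange(n) != i                    # hold out observation i
        _, _, coefs = lars_path(X[mask], y[mask], method="lasso")
        t_knots = np.abs(coefs).sum(axis=0)         # ||beta||_1 at each knot
        # Each coefficient path is piecewise linear in t: interpolate to grid.
        beta_grid = np.array([np.interp(t_grid, t_knots, coefs[j])
                              for j in range(p)])
        preds = X[i] @ beta_grid                    # held-out predictions vs t
        sq_err += (y[i] - preds) ** 2

    return t_grid, sq_err / n
```

Calling `t_grid, lo = loo_error_curve(X, y)` and taking `t_grid[np.argmin(lo)]` gives a gridded approximation of the optimal $t$; LO-LARS computes the same curve exactly, with no grid, by tracking all $n$ held-out paths together.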

By leveraging the matrix inversion lemma, LO-LARS significantly reduces computational overhead, making it inexpensive to detect changes in activation states across the leave-one-out subproblems. This substantial gain in efficiency makes the exact computation of LO feasible for many practical datasets.
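
The matrix inversion lemma enters because deleting one observation changes the active-set Gram matrix by a rank-one term, so its inverse can be downdated in $O(k^2)$ time rather than refactored in $O(k^3)$. A minimal NumPy sketch of that identity (our illustration of the Sherman-Morrison downdate, not the paper's full bookkeeping):

```python
import numpy as np

def downdate_inverse(G_inv, x):
    """Sherman-Morrison downdate: given G_inv = (X^T X)^{-1}, return
    (X^T X - x x^T)^{-1}, i.e. the inverse Gram matrix after deleting the
    observation with feature row x. Assumes the downdated matrix is still
    invertible (x @ G_inv @ x != 1)."""
    Gx = G_inv @ x
    return G_inv + np.outer(Gx, Gx) / (1.0 - x @ Gx)

# Quick sanity check against a direct inverse.
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
G_inv = np.linalg.inv(X.T @ X)
assert np.allclose(downdate_inverse(G_inv, X[0]),
                   np.linalg.inv(X[1:].T @ X[1:]))
```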

Numerical Experiments

The practical applicability of LO-LARS is demonstrated across various real-world datasets of different dimensions, including the well-known diabetes dataset and a high-dimensional riboflavin gene expression dataset. Through these empirical studies, the performance, efficiency, and limitations of LO-LARS are thoroughly examined.

The benchmarks provided indicate that LO-LARS is sufficiently fast for many practical datasets. For instance, on the riboflavin dataset, in which riboflavin production and the expression of 4088 genes were measured in cultures of Bacillus subtilis, the algorithm computes the LO function efficiently: with early stopping it finds the global optimum in approximately 0.5 seconds, versus slightly over 3 seconds for the full computation without early stopping.

Theoretical Foundations

The paper further develops the theoretical groundwork for understanding and implementing efficient solution paths for lasso problems. Detailed algorithms are derived, leveraging convex optimization theory and incorporating the matrix inversion lemma to ensure computational feasibility.

The author discusses several key theorems, including results adapted from Ryan Tibshirani and Saharon Rosset, to elucidate properties of lasso solution paths. Through rigorous proofs, the paper delineates conditions under which solution paths are continuous and the relevant rank conditions hold, strengthening the reliability of LO-LARS in practical applications.

Implications and Future Directions

The integration of exact LO computation with lasso regression can significantly impact high-dimensional model selection by providing a more accurate measure of prediction error with reduced bias. Although generalized cross-validation (GCV) may perform better on certain problem structures where the design matrix is near-diagonal, LO remains a sensible default for many scenarios: as the n-fold extreme of v-fold cross-validation, it has the lowest bias.
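
For reference, GCV replaces the $n$ held-out fits with a single full-data fit corrected by an effective degrees-of-freedom term; in the form commonly applied to the lasso, where the number of nonzero coefficients serves as the degrees-of-freedom estimate (Zou, Hastie, and Tibshirani, 2007), it reads (notation ours):

$$\mathrm{GCV}(t) = \frac{\frac{1}{n}\lVert y - \hat{y}(t) \rVert_2^2}{\bigl(1 - \widehat{\mathrm{df}}(t)/n\bigr)^2}.$$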

Future research may investigate adaptations of LO-LARS for structured datasets where the direct application of LO cross-validation is suboptimal. Additionally, exploring the potential of integrating LO-LARS with other regularization techniques and extending it to other statistical models or complex data transformations, such as those involving generalized linear models or high-dimensional datasets, could offer further advancements.

Conclusion

"Evaluation and Optimization of Leave-one-out Cross-validation for the Lasso" presents an efficient algorithm, LO-LARS, facilitating practical computation of the leave-one-out cross-validation for lasso problems. By optimizing the lasso shrinkage parameter, LO-LARS provides an exact means to minimize LO cross-validation errors, thereby offering enhanced model performance in comparison to traditional methods. This research underscores the importance of reducing bias in high-dimensional settings and advocates LO as a reliable estimation approach, albeit with considerations for specific dataset structures where alternative methods like GCV might be advantageous. Future research directions are poised to adapt these methodologies to an even broader class of problems and explore novel applications.
