- The paper comprehensively compares Best Subset Selection, Forward Stepwise Selection, and Lasso methods for variable selection across various scenarios, evaluating their predictive performance.
- It finds no universal winner between Best Subset Selection and the Lasso: Best Subset performs better at high SNR and the Lasso at low SNR, while Forward Stepwise tracks Best Subset surprisingly closely.
- The study highlights the Relaxed Lasso as a versatile approach that adapts well across different SNR levels, effectively combining strengths of Best Subset and Lasso.
Analysis of Variable Selection Methods in Regression
This paper presents a comprehensive evaluation of several popular methods for variable selection in linear regression models: Best Subset Selection, Forward Stepwise Selection, and the Lasso, alongside a simplified version of the Relaxed Lasso. The motivation for this investigation stems from advances in optimization algorithms, particularly those offered by recent formulations of Best Subset Selection as a Mixed Integer Optimization (MIO) problem, which have made it feasible to solve larger instances than previously possible.
The paper's core contribution is a detailed comparison of these methods across various scenarios differing in signal-to-noise ratio (SNR), dimensionality, and sparsity of the true model coefficients. The analysis highlights three key findings:
- No Clear Dominance: Neither Best Subset Selection nor the Lasso universally outperforms the other. Best Subset Selection delivers better predictive performance in high-SNR settings, while the Lasso is preferable at low SNR, where its heavier shrinkage yields a more favorable bias-variance trade-off.
- Similarity Between Best Subset and Forward Stepwise Selection: Forward Stepwise Selection and Best Subset Selection yield nearly identical performance across most scenarios, contrasting with prior work suggesting substantial differences between them.
- Relaxed Lasso's Robust Performance: The Relaxed Lasso emerges as the most versatile approach, combining the strengths of both Best Subset and Lasso under varying conditions. It adapts well across different SNR levels by appropriately tuning its shrinkage parameter.
The paper's empirical component consists of simulations across diverse settings, emulating realistic regression environments with varying degrees of predictor correlation, coefficient sparsity, and SNR. These setups allow model performance to be examined relative to oracle estimates and across several measures of predictive accuracy: Relative Risk, Relative Test Error, and Proportion of Variance Explained (PVE).
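As a rough sketch of this kind of simulation setup (the specific parameter values and variable names here are illustrative, not the paper's grid), one can generate correlated predictors at a target SNR and compute the accuracy metrics directly from the true covariance:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, s, rho, snr = 100, 10, 5, 0.35, 1.0  # illustrative values only

# Predictor covariance with correlation decaying as rho^|i-j|
Sigma = rho ** np.abs(np.subtract.outer(np.arange(p), np.arange(p)))

# Sparse true coefficient vector with s nonzero entries
beta0 = np.zeros(p)
beta0[:s] = 1.0

# Pick the noise variance to hit the target SNR = beta0' Sigma beta0 / sigma^2
signal_var = beta0 @ Sigma @ beta0
sigma2 = signal_var / snr

X = rng.multivariate_normal(np.zeros(p), Sigma, size=n)
y = X @ beta0 + rng.normal(scale=np.sqrt(sigma2), size=n)

def relative_risk(beta_hat):
    """RR: excess prediction risk of beta_hat, normalized by the signal."""
    d = beta_hat - beta0
    return (d @ Sigma @ d) / signal_var

def relative_test_error(beta_hat):
    """RTE: expected test error divided by the Bayes error sigma^2 (1 is optimal)."""
    d = beta_hat - beta0
    return (d @ Sigma @ d + sigma2) / sigma2

def pve(beta_hat):
    """Proportion of variance explained by predictions from beta_hat."""
    d = beta_hat - beta0
    return 1 - (d @ Sigma @ d + sigma2) / (signal_var + sigma2)

# The oracle (true coefficients) attains RR = 0, RTE = 1, and PVE = snr/(1+snr)
```

Note how the oracle PVE of snr/(1+snr) caps what any estimator can achieve, which is why the paper reports performance relative to oracle estimates.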
Key Methodological Approaches
- Best Subset Selection is explored through an MIO framework using the Gurobi solver, enabling evaluation of large-scale instances despite the problem's NP-hard complexity. This approach is noted to be far more efficient than traditional branch-and-bound techniques.
- Forward Stepwise Selection remains relevant, delivering strong predictive performance at low computational cost thanks to its greedy, one-at-a-time inclusion of variables.
- The Lasso, a convex relaxation of the subset selection problem, is known for its computational tractability and effectiveness in scenarios with high-dimensional data where traditional methods falter.
- The Relaxed Lasso enhances the Lasso by mitigating its inherent shrinkage, blending the Lasso solution with a least-squares fit on the selected variables, thereby balancing aggressive variable selection against stable coefficient estimation.
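The simplified relaxed lasso the paper studies can be sketched as a convex combination of the lasso solution and the least-squares fit on the lasso's active set. The sketch below (an assumption-laden illustration using scikit-learn, not the paper's R implementation) makes the role of the blending parameter `gamma` explicit:

```python
import numpy as np
from sklearn.linear_model import Lasso

def relaxed_lasso(X, y, lam, gamma):
    """Simplified relaxed lasso: blend the lasso estimate with the
    least-squares fit restricted to the lasso's active set.
    gamma = 1 recovers the lasso; gamma = 0 removes all shrinkage
    from the selected coefficients."""
    lasso = Lasso(alpha=lam, fit_intercept=False).fit(X, y)
    beta_lasso = lasso.coef_
    active = np.flatnonzero(beta_lasso)
    beta_ls = np.zeros_like(beta_lasso)
    if active.size:
        # Unpenalized least squares on the active variables only
        beta_ls[active] = np.linalg.lstsq(X[:, active], y, rcond=None)[0]
    return gamma * beta_lasso + (1 - gamma) * beta_ls
```

In practice both `lam` and `gamma` would be tuned by cross-validation; the paper's finding that low SNR favors more shrinkage and high SNR favors less corresponds to `gamma` near 1 and near 0, respectively.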
Computational Considerations
While the Lasso and Relaxed Lasso are comparably cheap to compute, the MIO-based Best Subset Selection demands far greater computational resources, particularly as the subset size or problem dimensionality grows. The implementation imposed a time limit of three minutes per subset size, so optimal solutions could not be certified in all instances.
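To see why a time limit is needed at all, consider the naive exact alternative: enumerating every size-k subset costs C(p, k) least-squares fits, which is exactly the combinatorial explosion the MIO formulation attacks with branch-and-bound. A minimal sketch of exhaustive enumeration (illustrative only, feasible just for small p):

```python
import numpy as np
from itertools import combinations

def best_subset_exhaustive(X, y, k):
    """Exact best subset of size k by brute-force enumeration.
    Cost grows as C(p, k), which is why MIO solvers such as Gurobi
    are required once p is no longer tiny."""
    n, p = X.shape
    best_rss, best_support, best_coef = np.inf, None, None
    for support in combinations(range(p), k):
        cols = list(support)
        # Least squares on this candidate support
        coef, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
        rss = np.sum((y - X[:, cols] @ coef) ** 2)
        if rss < best_rss:
            best_rss, best_support, best_coef = rss, cols, coef
    return best_support, best_coef, best_rss
```

An MIO solver reaches the same optimum without visiting all C(p, k) subsets, but as the section notes, it may still exhaust its time budget before certifying optimality.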
Implications and Future Directions
The findings underscore the importance of context in selecting an appropriate variable selection method. They suggest that while advancements in optimization have made traditional combinatorial approaches more feasible, they do not outright replace methods like the Lasso for practical applications, especially in high-dimensional settings with low SNR. The Relaxed Lasso's performance positions it as a flexible and powerful tool in many real-world applications.
Future research could explore hybrid methods that combine the advantages of multiple strategies, further improvements in computational efficiency for MIO formulations, and other evaluation metrics, such as those focused on variable recovery. Real-world applications extending beyond simulations could also shed light on method efficacy across domains. The paper's accompanying R package, `bestsubset`, lays a foundation for such explorations, enabling reproducibility and extended simulation studies.