Revisiting Randomization in Greedy Model Search (2506.15643v1)

Published 18 Jun 2025 in stat.ML and cs.LG

Abstract: Combining randomized estimators in an ensemble, such as via random forests, has become a fundamental technique in modern data science, but can be computationally expensive. Furthermore, the mechanism by which this improves predictive performance is poorly understood. We address these issues in the context of sparse linear regression by proposing and analyzing an ensemble of greedy forward selection estimators that are randomized by feature subsampling -- at each iteration, the best feature is selected from within a random subset. We design a novel implementation based on dynamic programming that greatly improves its computational efficiency. Furthermore, we show via careful numerical experiments that our method can outperform popular methods such as lasso and elastic net across a wide range of settings. Next, contrary to prevailing belief that randomized ensembling is analogous to shrinkage, we show via numerical experiments that it can simultaneously reduce training error and degrees of freedom, thereby shifting the entire bias-variance trade-off curve of the base estimator. We prove this fact rigorously in the setting of orthogonal features, in which case, the ensemble estimator rescales the ordinary least squares coefficients with a two-parameter family of logistic weights, thereby enlarging the model search space. These results enhance our understanding of random forests and suggest that implicit regularization in general may have more complicated effects than explicit regularization.

Summary

  • The paper introduces randomized greedy search (RGS), an ensemble of greedy forward selection estimators for sparse linear regression in which each step picks the best feature from a random subset.
  • A dynamic programming-based implementation greatly reduces the computational cost, while feature subsampling lets the ensemble explore a larger model space than traditional greedy selection.
  • Experiments show that RGS can outperform lasso and elastic net across a wide range of settings, with randomization shifting the entire bias-variance trade-off curve of the base estimator.

The paper "Revisiting Randomization in Greedy Model Search" introduces a novel approach to ensemble learning within the framework of sparse linear regression. The study extends traditional greedy forward selection methods by incorporating a randomized feature subsampling technique, resulting in the Randomized Greedy Search (RGS) method. This technique aims to address the computational inefficiencies and the limited predictive understanding associated with randomized ensembles like random forests in the context of linear regression.

At the core of the method, RGS randomizes each iteration of greedy forward selection: rather than scanning all features, it selects the best feature from within a random subset, and the resulting estimators are combined in an ensemble. The authors propose a dynamic programming-based implementation of RGS that overcomes the computational cost typically associated with such ensembles. Notably, RGS is shown, through extensive experiments, to outperform popular methods such as lasso and elastic net across a wide range of settings.
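
To make the procedure concrete, below is a minimal Python sketch of a randomized greedy search ensemble. The function names, the candidate-subset size m, the number of steps k, and the coefficient-averaging aggregation are illustrative assumptions rather than the paper's implementation; in particular, the naive refit-per-candidate loop stands in for the authors' far more efficient dynamic-programming scheme.

```python
import numpy as np

def randomized_forward_selection(X, y, k, m, rng):
    """One randomized greedy forward selection path (illustrative sketch).

    At each of k steps, m candidate features are drawn at random and the
    candidate that most reduces the residual sum of squares is added.
    """
    n, p = X.shape
    selected = []
    for _ in range(k):
        remaining = [j for j in range(p) if j not in selected]
        candidates = rng.choice(remaining, size=min(m, len(remaining)), replace=False)
        best_j, best_rss = None, np.inf
        for j in candidates:
            cols = selected + [int(j)]
            beta, *_ = np.linalg.lstsq(X[:, cols], y, rcond=None)
            rss = np.sum((y - X[:, cols] @ beta) ** 2)
            if rss < best_rss:
                best_j, best_rss = int(j), rss
        selected.append(best_j)
    # Refit on the selected support and embed into a length-p coefficient vector.
    beta_full = np.zeros(p)
    beta_sub, *_ = np.linalg.lstsq(X[:, selected], y, rcond=None)
    beta_full[selected] = beta_sub
    return beta_full

def randomized_greedy_search(X, y, k=10, m=5, n_estimators=100, seed=0):
    """Average the coefficient vectors of many randomized greedy paths."""
    rng = np.random.default_rng(seed)
    betas = [randomized_forward_selection(X, y, k, m, rng) for _ in range(n_estimators)]
    return np.mean(betas, axis=0)
```

Setting m equal to the number of features recovers ordinary greedy forward selection, in which case every path is identical and the ensemble adds nothing; smaller m injects the randomness that the paper studies.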

The authors challenge the prevailing view that the benefit of randomized ensembling is analogous to shrinkage. Through careful numerical experiments and theoretical analysis, they show that RGS can simultaneously reduce training error and degrees of freedom, thereby shifting the entire bias-variance trade-off curve of the base estimator. In the orthogonal-feature setting, this gain is proved rigorously: the randomized ensemble effectively searches a larger model space than its non-randomized counterpart.
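
One way to probe this claim empirically is to measure effective degrees of freedom via the standard covariance definition, df = (1/σ²) Σᵢ Cov(ŷᵢ, yᵢ). The sketch below is a generic Monte Carlo estimator of that quantity for simulated data with known mean and noise level; it is not code from the paper, and the interface and replication count are arbitrary choices.

```python
import numpy as np

def estimate_df(fit_predict, X, mu, sigma, n_rep=200, seed=0):
    """Monte Carlo estimate of effective degrees of freedom,
    df = (1 / sigma^2) * sum_i Cov(yhat_i, y_i),
    for a routine fit_predict(X, y) that returns in-sample predictions.
    mu is the true mean vector and sigma the noise standard deviation
    (both known here because the data are simulated)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    ys = np.empty((n_rep, n))
    yhats = np.empty((n_rep, n))
    for r in range(n_rep):
        y = mu + sigma * rng.standard_normal(n)
        ys[r] = y
        yhats[r] = fit_predict(X, y)
    # Empirical covariance between each fitted value and its response.
    cov = np.mean((yhats - yhats.mean(axis=0)) * (ys - ys.mean(axis=0)), axis=0)
    return cov.sum() / sigma**2

# Example usage (assumes the randomized_greedy_search sketch above):
# df_rgs = estimate_df(lambda X, y: X @ randomized_greedy_search(X, y), X, mu, sigma)
```

Comparing such an estimate for RGS against plain greedy selection with the same number of steps is one way to observe the simultaneous reduction in training error and degrees of freedom that the paper reports.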

The paper underscores several implications for both theoretical understanding and practical applications of such methods in sparse linear regression. It suggests that implicit regularization, facilitated by RGS's randomization technique, may have more intricate consequences than explicit regularization traditionally achieved by methods like ridge or lasso regression.

The method's theoretical underpinnings are rigorously developed in the orthogonal-feature setting. The authors prove the claims about training error and degrees of freedom, drawing on tools such as majorization and asymptotic analysis. These analyses show that, with orthogonal features, the ensemble estimator rescales the ordinary least squares coefficients by a two-parameter family of logistic weights, enlarging the model search space beyond that of traditional greedy heuristics.
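
To convey the flavor of this result, the display below is a schematic rendering rather than the paper's exact statement: in the orthogonal case the ensemble acts coordinatewise, multiplying each OLS coefficient by a logistic-type weight. The parameters a and b and the choice of |t| as the weight's argument are illustrative stand-ins for the paper's two-parameter family, which depends on the method's tuning parameters.

```latex
\hat{\beta}^{\mathrm{RGS}}_j = w\!\left(\hat{\beta}^{\mathrm{OLS}}_j\right)\hat{\beta}^{\mathrm{OLS}}_j,
\qquad
w(t) = \frac{1}{1 + \exp\!\left\{-a\,(\lvert t\rvert - b)\right\}}
```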

Practically, the RGS method's performance is compelling across a range of simulation setups, demonstrating its robustness across different signal-to-noise ratios, feature correlations, and sparsity structures. Such performance stability indicates potential practical applications of RGS in areas demanding high precision and interpretability, such as causal inference in high-dimensional datasets.

Looking toward future developments in AI, the study encourages consideration of randomization's broader role in improving model generalizability and interpretability. It suggests a more nuanced view of randomization: not merely a variance reducer, but an explorative tool that opens opportunities for bias reduction in model ensembles.

Overall, the paper critically evaluates and demonstrates that randomized feature selection within greedy search procedures can result in computationally manageable, more accurate models that leverage larger model spaces for robust prediction in high-dimensional regression settings. This work stimulates further consideration of randomization in ensemble learning, presenting a path that could influence future research and applications within AI and data science.
