GD dominance versus ridge with negative regularization and moderate stepsizes
Determine whether gradient descent (GD) with moderate step sizes, n/∥XX^T∥ < η < 2n/∥XX^T∥, continues to dominate ridge regression in excess risk when the ridge regularization parameter is allowed to be negative (λ < 0), in well-specified random-design linear regression under the paper’s assumptions on covariates and noise.
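The moderate-stepsize regime can be illustrated numerically. The sketch below is not taken from the paper. It assumes the empirical risk is normalized as (1/(2n))∥Xw − y∥^2, so the Hessian X^T X/n has top eigenvalue ∥XX^T∥/n and the thresholds n/∥XX^T∥ and 2n/∥XX^T∥ are where the top-direction contraction factor 1 − η∥XX^T∥/n crosses 0 and −1. It also assumes isotropic Gaussian covariates in an overparameterized setting (d > n) and GD started from zero. The helper names (`gd_path`, `ridge`, `excess_risk`) and all constants are hypothetical choices for illustration.

```python
import numpy as np

def empirical_risk(X, y, w):
    """Empirical risk (1/(2n)) * ||Xw - y||^2 (this normalization is an assumption)."""
    n = X.shape[0]
    r = X @ w - y
    return 0.5 * (r @ r) / n

def gd_path(X, y, eta, steps):
    """Run GD from w = 0 on the empirical risk and return every iterate."""
    n, d = X.shape
    w = np.zeros(d)
    path = [w.copy()]
    for _ in range(steps):
        w = w - eta * X.T @ (X @ w - y) / n
        path.append(w.copy())
    return np.array(path)

def ridge(X, y, lam):
    """Ridge estimator X^T (X X^T + n*lam*I)^{-1} y.
    lam may be negative as long as X X^T + n*lam*I stays invertible."""
    n = X.shape[0]
    return X.T @ np.linalg.solve(X @ X.T + n * lam * np.eye(n), y)

# Synthetic well-specified data (all sizes and the noise level are illustrative).
rng = np.random.default_rng(0)
n, d, sigma = 50, 200, 0.5
X = rng.standard_normal((n, d))
w_star = rng.standard_normal(d) / np.sqrt(d)
y = X @ w_star + sigma * rng.standard_normal(n)

# Moderate stepsize: strictly between n/||XX^T|| and 2n/||XX^T||.
op_norm = np.linalg.norm(X @ X.T, 2)
eta = 1.5 * n / op_norm
path = gd_path(X, y, eta, steps=30)
risks = np.array([empirical_risk(X, y, w) for w in path])

# Error along the top eigenvector of X^T X, measured against the GD limit
# (the minimum-norm interpolator, i.e. ridge with lam = 0 when d > n).
_, _, Vt = np.linalg.svd(X, full_matrices=False)
v_top = Vt[0]
w_interp = ridge(X, y, 0.0)
top_err = (path - w_interp) @ v_top

def excess_risk(w):
    """With identity covariance (assumed here), excess risk is ||w - w*||^2."""
    return float(np.sum((w - w_star) ** 2))

# Ridge grid that includes negative lam, kept above -lambda_min(XX^T)/n.
lam_min = np.linalg.eigvalsh(X @ X.T)[0] / n
lams = np.linspace(-0.9 * lam_min, 5.0, 200)
best_ridge = min(excess_risk(ridge(X, y, lam)) for lam in lams)
best_gd = min(excess_risk(w) for w in path)

print("empirical risk decreases monotonically:", bool(np.all(np.diff(risks) < 0)))
print("top-direction error flips sign each step:",
      bool(np.all(top_err[:10] * top_err[1:11] < 0)))
print("best excess risk on the GD path:", best_gd)
print("best excess risk on the ridge grid (incl. lam < 0):", best_ridge)
```

A single simulated draw like this only shows the mechanics of the comparison: the iterates overshoot along the top eigen-direction while every eigen-component of the empirical risk still contracts. It is not evidence for or against the conjecture, which concerns excess risk under the paper's assumptions.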
References
It is unclear if GD still dominates ridge regression when allowing λ < 0. We conjecture that this is true when extending the stepsize for GD from small ones to moderate ones, n/∥XX^T∥ < η < 2n/∥XX^T∥, with which GD oscillates in iterates but still monotonically decreases the empirical risk. This is left for future investigation.
Source: Risk Comparisons in Linear Regression: Implicit Regularization Dominates Explicit Regularization (arXiv:2509.17251, Wu et al., 21 Sep 2025), Concluding remarks, paragraph “Negative ridge and oscillatory GD”