Non-asymptotic state-evolution conjecture for diagonal networks (LASSO)

Establish that, for diagonal two-layer linear networks with ℓ2 weight decay (equivalently, LASSO regression), for any fixed λ > 0 and Δ ≥ 0 and for all sufficiently large sample size n and dimension d, both the excess risk of the empirical risk minimizer and the Bayes-optimal risk concentrate, to within a multiplicative o(1) error, around the deterministic values predicted by the AMP state-evolution fixed-point equations for LASSO.
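As a concrete reference point, the following minimal sketch iterates the standard Bayati–Montanari state evolution for LASSO with an i.i.d. Gaussian design to its fixed point. The Bernoulli–Gaussian signal prior, the threshold parameter alpha, and all numerical values are illustrative assumptions, and the paper's exact SE system (including its calibration of the threshold to the penalty λ) may differ from this textbook form.

```python
# A minimal numerical sketch (not the paper's exact system) of the standard
# Bayati-Montanari state evolution for LASSO with i.i.d. Gaussian design:
#   tau_{t+1}^2 = Delta + (1/delta) * E[(eta(Theta0 + tau_t Z; alpha tau_t) - Theta0)^2],
# where eta is soft thresholding, Z ~ N(0,1), delta = n/d, and Delta is the
# noise variance. The Bernoulli-Gaussian prior on Theta0 and the threshold
# ratio alpha are illustrative assumptions; the paper calibrates the
# threshold to the LASSO penalty lambda instead.
import numpy as np

rng = np.random.default_rng(0)

def soft_threshold(x, t):
    """Soft-thresholding denoiser eta(x; t) = sign(x) * max(|x| - t, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def se_fixed_point(delta=0.5, Delta=0.05, alpha=1.5, rho=0.1,
                   n_mc=200_000, n_iter=200, tol=1e-9):
    """Iterate the LASSO state evolution to its fixed point.

    delta : sampling ratio n/d
    Delta : noise variance
    alpha : threshold-to-tau ratio (stand-in for the lambda calibration)
    rho   : sparsity level of the Bernoulli-Gaussian signal prior
    """
    # Monte Carlo samples from the signal prior and the effective noise.
    theta0 = rng.standard_normal(n_mc) * (rng.random(n_mc) < rho)
    z = rng.standard_normal(n_mc)
    tau2 = Delta + np.mean(theta0**2) / delta  # standard initialization
    for _ in range(n_iter):
        tau = np.sqrt(tau2)
        # Asymptotic per-coordinate MSE of the denoiser at the current tau.
        mse = np.mean((soft_threshold(theta0 + tau * z, alpha * tau) - theta0) ** 2)
        tau2_new = Delta + mse / delta
        if abs(tau2_new - tau2) < tol:
            tau2 = tau2_new
            break
        tau2 = tau2_new
    return tau2, mse

tau2_star, mse_star = se_fixed_point()
print(f"fixed point: tau^2 = {tau2_star:.5f}, predicted MSE = {mse_star:.5f}")
```

The fixed-point quantities (here tau2_star and the predicted MSE) play the role of the deterministic predictions that the conjecture asserts the finite-size risks concentrate around.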

Background

The paper shows that diagonal two-layer linear networks trained with ℓ2 weight decay are equivalent to LASSO regression. It derives excess-risk and spectrum predictions via AMP state evolution (SE) and observes strong empirical agreement well beyond the proportional-asymptotics regime. The authors therefore pose an explicit conjecture that these SE predictions hold non-asymptotically for both the ERM and the Bayes-optimal risk.
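To make the equivalence concrete, here is the standard factorization argument in one line; the coordinate-wise AM-GM step is the whole content, and the paper's exact normalization of the penalty is assumed rather than quoted.

```latex
% Sketch of the standard factorization argument (the paper's exact penalty
% normalization may differ). Writing \theta = u \odot v and minimizing the
% weight decay over all factorizations, AM-GM gives, coordinate-wise,
\[
  \min_{u \odot v = \theta} \frac{\lambda}{2}\bigl(\|u\|_2^2 + \|v\|_2^2\bigr)
  = \lambda \sum_{i=1}^{d} \min_{u_i v_i = \theta_i} \frac{u_i^2 + v_i^2}{2}
  = \lambda \sum_{i=1}^{d} |\theta_i|
  = \lambda \|\theta\|_1,
\]
% with equality at |u_i| = |v_i| = \sqrt{|\theta_i|}, so training the diagonal
% network with \ell_2 weight decay is an \ell_1-penalized (LASSO) problem in \theta.
```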

A proof would generalize recent non-asymptotic advances for ridge regression to sparse recovery, putting the heuristic use of SE for finite-sample LASSO performance on rigorous footing across a range of scalings of n and d. A finite-size comparison of this kind is sketched below.
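For intuition on what such a finite-sample statement would certify, the following hedged sketch fits LASSO at moderate (n, d) with scikit-learn and reports the per-coordinate estimation error, which the conjecture predicts should match the SE fixed point (e.g., mse_star from the sketch above, whose defaults use the same delta, Delta, and rho) up to a multiplicative o(1). The penalty value is hypothetical and the λ-to-threshold calibration is skipped, so agreement here is qualitative only.

```python
# Hedged finite-size check of the conjecture's flavor: empirical LASSO error
# at moderate (n, d), to be compared against the SE fixed point. The penalty
# lam is a hypothetical choice; the SE threshold alpha would need to be
# calibrated to it for a quantitative comparison.
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n, d, rho, Delta = 2000, 4000, 0.1, 0.05        # delta = n/d = 0.5
X = rng.standard_normal((n, d)) / np.sqrt(n)    # i.i.d. N(0, 1/n) design
theta0 = rng.standard_normal(d) * (rng.random(d) < rho)
y = X @ theta0 + np.sqrt(Delta) * rng.standard_normal(n)

lam = 0.05  # hypothetical penalty for (1/2)||y - Xw||^2 + lam * ||w||_1
# sklearn's Lasso minimizes (1/(2n))||y - Xw||^2 + alpha ||w||_1,
# so alpha = lam / n matches the unnormalized objective above.
fit = Lasso(alpha=lam / n, fit_intercept=False, max_iter=50_000).fit(X, y)
mse_emp = np.mean((fit.coef_ - theta0) ** 2)
print(f"empirical per-coordinate MSE at n={n}, d={d}: {mse_emp:.5f}")
```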

References

Conjecture. Let λ > 0, Δ ≥ 0, and consider n, d ≫ 1 sufficiently large. Then, with probability at least 1 − o_n(1) − o_d(1), both the excess risk of the empirical risk minimizer (the LASSO estimator) and the Bayes-optimal risk satisfy |R(θ̂) − 𝖱_{n,d}| = 𝖱_{n,d} · o_{n,d}(1), where 𝖱_{n,d} denotes the corresponding state-evolution prediction.

Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime (arXiv:2509.24882, Defilippis et al., 29 Sep 2025), Section "Non-asymptotic state evolution", Conjecture (following the quadratic-network conjecture).