Non-asymptotic state-evolution conjecture for quadratic networks (matrix compressed sensing)

Establish that, for over-parameterized two-layer quadratic neural networks trained with weight decay (equivalently, nuclear-norm regularized matrix compressed sensing), with any fixed regularization strength λ>0 and any Δ≥0, and for sufficiently large sample size n and dimension d, both the excess risk of the empirical risk minimizer and the Bayes-optimal risk concentrate, up to a multiplicative 1+o(1) factor, around the deterministic values predicted by the AMP state-evolution fixed-point equations for these models.
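For orientation, the state-evolution (SE) prediction is, schematically, the solution of a scalar self-consistency equation. The display below is an illustrative template rather than the paper's exact equations: the sample ratio α, the effective noise σ⋆, the planted matrix S⋆, and the matrix denoiser η are notational assumptions, and the correct normalizations and denoiser (nuclear-norm proximal for the ERM case, posterior mean for the Bayes-optimal case) depend on the model.

\[
\sigma_\star^2 \;=\; \Delta \;+\; \frac{1}{\alpha}\,\mathbb{E}\!\left[\frac{1}{d^2}\,\big\|\eta_{\sigma_\star}\!\big(S^\star + \sigma_\star Z\big) - S^\star\big\|_F^2\right],
\qquad
\mathsf R_{n,d} \;=\; \mathsf R(\sigma_\star),
\]

where Z has i.i.d. standard Gaussian entries; the predicted risk \mathsf R_{n,d} is then a deterministic function of the fixed point \sigma_\star.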

Background

The authors map training of quadratic networks with weight decay to nuclear-norm regularized matrix compressed sensing and analyze performance using AMP state evolution (SE). Rigorous guarantees for AMP typically require proportional asymptotics, i.e., n, d → ∞ with the ratio n/d and the regularization λ held fixed. Their experiments show excellent agreement well beyond this regime, motivating a formal conjecture that the SE predictions remain accurate non-asymptotically.
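As a concrete illustration of how an SE prediction of this kind can be evaluated numerically, the sketch below iterates an assumed scalar fixed-point map of the form σ² ← Δ + mse(σ)/α, estimating the denoiser mean-squared error by Monte Carlo for a singular-value soft-thresholding (nuclear-norm proximal) denoiser. The function names, the threshold calibration, the low-rank signal prior, and the sample-ratio convention are all illustrative assumptions, not the paper's implementation.

```python
import numpy as np


def svt(M, thresh):
    """Singular-value soft thresholding: proximal operator of the nuclear norm."""
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    return (U * np.maximum(s - thresh, 0.0)) @ Vt


def mc_mse(sigma, lam, d=40, rank=2, n_samples=20, seed=0):
    """Monte Carlo estimate of the per-entry denoising error
    E ||svt(S + sigma Z) - S||_F^2 / d^2 for a random rank-`rank` signal S
    with i.i.d. Gaussian factors (illustrative signal prior)."""
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(n_samples):
        U = rng.normal(size=(d, rank))
        V = rng.normal(size=(d, rank))
        S = U @ V.T
        Z = rng.normal(size=(d, d))
        # Threshold ~ lam * sigma * sqrt(d) is an illustrative calibration so the
        # shrinkage tracks the scale of the noise singular values; the correct
        # calibration depends on the model's normalizations.
        err = svt(S + sigma * Z, lam * sigma * np.sqrt(d)) - S
        errs.append(np.sum(err ** 2) / d ** 2)
    return float(np.mean(errs))


def se_fixed_point(alpha, delta, lam, iters=100, tol=1e-6):
    """Iterate the assumed scalar SE map sigma^2 <- delta + mse(sigma) / alpha."""
    sigma2 = 1.0
    for _ in range(iters):
        sigma2_new = delta + mc_mse(np.sqrt(sigma2), lam) / alpha
        if abs(sigma2_new - sigma2) < tol:
            break
        sigma2 = sigma2_new
    return sigma2


if __name__ == "__main__":
    # alpha = assumed sample ratio, delta = noise level, lam = regularization strength.
    print("SE fixed point sigma^2 =", se_fixed_point(alpha=2.0, delta=0.1, lam=2.0))
```

Comparing such a fixed point against finite-size ERM experiments at matched (n, d, λ, Δ) is the kind of check that the conjecture formalizes.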

Proving this conjecture would place the derived phase diagrams and scaling laws on rigorous footing, clarifying when SE-based predictions reliably approximate finite-sample behavior for both ERM and Bayes-optimal estimators in quadratic networks.

References

Conjecture. Let \lambda>0, \Delta\geq0, and consider n,d\gg 1 sufficiently large. Then, with probability at least 1-o_n(1)-o_d(1), both the excess risk of the empirical risk minimizer (defined in the source's equation \texttt{eq:def:quadratic_network}) and the Bayes-optimal risk satisfy |R(\hat{S}) - \mathsf R_{n,d}| = \mathsf R_{n,d}\cdot o_{n,d}(1).

Scaling Laws and Spectra of Shallow Neural Networks in the Feature Learning Regime (arXiv:2509.24882, Defilippis et al., 29 Sep 2025), Section "Non-asymptotic state evolution", Conjecture (label \texttt{conjecture:quadratic}).