
From Zero to Hero: How local curvature at artless initial conditions leads away from bad minima (2403.02418v2)

Published 4 Mar 2024 in cs.LG, cond-mat.dis-nn, and cond-mat.stat-mech

Abstract: We provide an analytical study of the evolution of the Hessian during gradient descent dynamics, and relate a transition in its spectral properties to the ability to find good minima. We focus on the phase retrieval problem as a case study for complex loss landscapes. We first characterize the high-dimensional limit where both the number $M$ and the dimension $N$ of the data go to infinity at fixed signal-to-noise ratio $\alpha = M/N$. For small $\alpha$, the Hessian is uninformative with respect to the signal. For $\alpha$ larger than a critical value, the Hessian displays, at short times, a downward direction pointing towards good minima. While descending, a transition in the spectrum takes place: the direction is lost and the system gets trapped in bad minima. Hence, the local landscape is benign and informative at first, before gradient descent brings the system into an uninformative maze. Through both theoretical analysis and numerical experiments, we show that this dynamical transition plays a crucial role at finite (even very large) $N$: it allows the system to recover the signal well before the algorithmic threshold corresponding to the $N\rightarrow\infty$ limit. Our analysis sheds light on this new mechanism that facilitates gradient descent dynamics in finite dimensions, and highlights the importance of a good initialization based on spectral properties for optimization in complex high-dimensional landscapes.
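To make the setting concrete, here is a minimal sketch (not the authors' code) of gradient descent on the standard real-valued phase-retrieval loss, tracking the smallest Hessian eigenvalue and the overlap with the hidden signal along the trajectory. All parameter choices (N, alpha, step size, iteration count) are illustrative assumptions, not values from the paper:

```python
# Minimal sketch (not the authors' code): gradient descent on the real-valued
# phase-retrieval loss  L(w) = (1/4M) * sum_mu ((a_mu . w)^2 - y_mu)^2,
# monitoring the smallest Hessian eigenvalue and the overlap with the signal.
import numpy as np

rng = np.random.default_rng(0)
N = 100                       # dimension (illustrative)
alpha = 6.0                   # ratio alpha = M / N (illustrative)
M = int(alpha * N)

x_star = rng.standard_normal(N)
x_star /= np.linalg.norm(x_star)          # hidden unit-norm signal
A = rng.standard_normal((M, N))           # sensing vectors a_mu as rows
y = (A @ x_star) ** 2                     # noiseless intensity measurements

def loss(w):
    return np.sum(((A @ w) ** 2 - y) ** 2) / (4 * M)

def grad(w):
    z = A @ w
    return A.T @ ((z ** 2 - y) * z) / M

def hessian(w):
    z = A @ w
    # H(w) = (1/M) * sum_mu (3 z_mu^2 - y_mu) a_mu a_mu^T
    return (A.T * (3 * z ** 2 - y)) @ A / M

w = rng.standard_normal(N)
w /= np.linalg.norm(w)                    # random ("artless") initialization
eta = 0.05                                # step size (illustrative)
for t in range(3001):
    if t % 500 == 0:
        lam_min = np.linalg.eigvalsh(hessian(w))[0]
        overlap = abs(w @ x_star) / np.linalg.norm(w)
        print(f"t={t:5d}  loss={loss(w):.4f}  "
              f"lambda_min={lam_min:+.3f}  overlap={overlap:.3f}")
    w -= eta * grad(w)
```

Monitoring lambda_min along the trajectory is how the spectral transition described in the abstract would show up numerically: an early negative eigenvalue (a descent direction correlated with the signal) that may be lost as the dynamics descends, with the printed overlap indicating whether the run escaped the bad minima.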

