Convergence rates beyond convex or Polyak–Łojasiewicz regimes
Establish convergence rates for the Adam optimizer and for the time-homogeneous Adam-type SDE (eq:cts-x+)–(eq:cts-y+) under assumptions weaker than convexity or the Polyak–Łojasiewicz condition, for example the dissipativity-at-infinity assumption (A2). The target is explicit nonasymptotic rates of convergence, either to minimizers or to the invariant measure.
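For orientation, the two kinds of assumption contrasted in this problem can be written out as follows. The Polyak–Łojasiewicz inequality is given in its standard form. The dissipativity condition is only one common form from the Langevin-type literature and is an assumption here, since the exact statement of (A2) is not reproduced in this entry. The symbols f, f^*, \mu, a, b and R are notation introduced for illustration.

```latex
% Polyak–Łojasiewicz (PL) inequality with constant \mu > 0, where f^* = \inf f:
\[
  \tfrac{1}{2}\,\|\nabla f(x)\|^{2} \;\ge\; \mu\,\bigl(f(x) - f^{*}\bigr)
  \qquad \text{for all } x \in \mathbb{R}^{d}.
\]
% Dissipativity at infinity (illustrative form; the source's (A2) may differ):
\[
  \langle x, \nabla f(x) \rangle \;\ge\; a\,\|x\|^{2} - b
  \qquad \text{for all } \|x\| \ge R,
\]
% with constants a > 0, b \ge 0, R \ge 0.
```

Under the PL inequality every stationary point is a global minimizer. A condition of the second kind constrains f only outside a ball, so it allows several local minima and saddle points inside that ball. This may be why the problem asks for rates toward either minimizers or an invariant measure.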
References
Nevertheless, important open questions remain, including the role of bias correction at finite horizons, convergence rates beyond convex or Polyak-Lojasiewicz regimes, robustness under heavy-tailed or state-dependent gradient noise, the structure of invariant measures induced by coordinatewise preconditioning, and metastability near saddle points in high dimensions.
— Fokker-Planck Analysis and Invariant Laws for a Continuous-Time Stochastic Model of Adam-Type Dynamics
(2604.00840 - Nyström, 1 Apr 2026) in Section 1, Introduction