Convergence rates beyond convex or Polyak–Łojasiewicz regimes

Establish convergence rates for the Adam optimizer and for the time-homogeneous Adam-type SDE (eq:cts-x+)–(eq:cts-y+) under assumptions weaker than convexity or the Polyak–Łojasiewicz condition, such as the dissipativity-at-infinity assumption (A2). The rates should be explicit and nonasymptotic, and may describe convergence either to minimizers or to the invariant measure.

Background

The analysis in the paper proves existence, uniqueness, and exponential convergence to a unique invariant measure for the limiting diffusion under smoothness and dissipativity. However, optimization convergence rates for Adam beyond convex or Polyak–Łojasiewicz settings are left unresolved.

Deriving rates in general nonconvex regimes would connect optimization guarantees to the stochastic stability and ergodicity results obtained for the continuous-time model.
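To make the gap concrete, the sketch below runs the standard discrete Adam update on a one-dimensional double-well objective, f(x) = x^4/4 - x^2/2. This objective is an illustrative choice, not taken from the paper. It is nonconvex, it violates the Polyak–Łojasiewicz inequality (1/2)|f'(x)|^2 >= mu (f(x) - f*) at the critical point x = 0, and yet it satisfies a commonly used dissipativity bound x f'(x) >= a x^2 - b with a = b = 1, since x f'(x) - (x^2 - 1) = (x^2 - 1)^2. That bound is assumed here only as a typical form of dissipativity at infinity; the paper's assumption (A2) may be stated differently. The function names, step size, and noise model are likewise hypothetical.

```python
import numpy as np

def f(x):
    # Double-well objective (illustrative choice, not from the paper).
    return 0.25 * x**4 - 0.5 * x**2

def grad_f(x):
    # Exact derivative of f.
    return x**3 - x

F_STAR = -0.25  # global minimum value, attained at x = +1 and x = -1

def adam(x0, steps=5000, lr=1e-2, beta1=0.9, beta2=0.999, eps=1e-8,
         noise_std=0.0, seed=0):
    """Standard Adam with bias correction on the scalar objective f.

    noise_std adds i.i.d. Gaussian noise to each gradient, a simple
    stand-in for stochastic gradients (an assumption of this sketch).
    """
    rng = np.random.default_rng(seed)
    x, m, v = float(x0), 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_f(x) + noise_std * rng.standard_normal()
        m = beta1 * m + (1.0 - beta1) * g        # first-moment estimate
        v = beta2 * v + (1.0 - beta2) * g * g    # second-moment estimate
        m_hat = m / (1.0 - beta1**t)             # bias correction
        v_hat = v / (1.0 - beta2**t)
        x -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return x

if __name__ == "__main__":
    # PL fails: zero gradient at x = 0 with a positive optimality gap.
    print("grad at 0:", grad_f(0.0), "gap at 0:", f(0.0) - F_STAR)
    # Dissipativity check on a grid (a = b = 1).
    xs = np.linspace(-5.0, 5.0, 1001)
    print("dissipative on grid:", bool(np.all(xs * grad_f(xs) >= xs**2 - 1.0 - 1e-12)))
    # Noise-free and noisy runs from the same starting point.
    print("final x (no noise):", adam(2.0))
    print("final x (noisy):   ", adam(2.0, noise_std=1.0))
```

In the noise-free run the iterate settles near one of the two wells, but nothing in the convex or Polyak–Łojasiewicz theory quantifies how fast. With noise, the long-run behavior is more naturally described by a distribution over both wells, which is where rates to an invariant measure become the relevant notion.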

References

Nevertheless, important open questions remain, including the role of bias correction at finite horizons, convergence rates beyond convex or Polyak–Łojasiewicz regimes, robustness under heavy-tailed or state-dependent gradient noise, the structure of invariant measures induced by coordinatewise preconditioning, and metastability near saddle points in high dimensions.

— Fokker-Planck Analysis and Invariant Laws for a Continuous-Time Stochastic Model of Adam-Type Dynamics (2604.00840 - Nyström, 1 Apr 2026) in Section 1, Introduction
