Extend slow-SDE analysis beyond the 2-scheme scaling of β2
Extend the slow SDE/ODE analysis and implicit-bias characterization for Adam and the broader AGM framework from the 2-scheme regime (where 1−β2=Θ(η^2)) to the intermediate 1.5-scheme and other scalings of 1−β2, deriving the correct continuous-time limit and identifying the associated sharpness regularizer that these methods implicitly minimize.
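To make the scaling regimes concrete, the sketch below computes β2 from the learning rate η under a given exponent and reports how many steps the second-moment moving average effectively remembers. It assumes that a "p-scheme" means 1−β2 = c·η^p, so that the 1.5-scheme corresponds to p = 1.5; this reading, the constant `c`, and the function names are illustrative assumptions, not definitions taken from the paper.

```python
def beta2_for_scheme(eta: float, exponent: float, c: float = 1.0) -> float:
    """Return beta2 such that 1 - beta2 = c * eta**exponent.

    `exponent` = 2.0 is the 2-scheme; 1.5 is the assumed 1.5-scheme.
    `c` is a hypothetical constant hidden by the Theta(.) notation.
    """
    one_minus_beta2 = c * eta ** exponent
    if not 0.0 < one_minus_beta2 < 1.0:
        raise ValueError("need 0 < c * eta**exponent < 1 for a valid beta2")
    return 1.0 - one_minus_beta2


def ema_horizon_steps(beta2: float) -> float:
    """Effective averaging window of an exponential moving average, in steps."""
    return 1.0 / (1.0 - beta2)


if __name__ == "__main__":
    eta = 0.01
    for exponent in (2.0, 1.5):
        b2 = beta2_for_scheme(eta, exponent)
        print(f"exponent={exponent}: beta2={b2:.6f}, "
              f"horizon={ema_horizon_steps(b2):.0f} steps")
```

With η = 0.01 and c = 1, the 2-scheme gives an averaging window of roughly 10^4 steps, while the assumed 1.5-scheme gives roughly 10^3. The preconditioner therefore forgets faster under the 1.5-scheme, which is one way to see why the long-timescale tracking argument may not carry over directly.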
References
Despite these advances, several important avenues remain open. First, we have focused on the “2-scheme” regime (where $1-\beta_2=O(\eta^2)$) in order to track Adam’s preconditioner over a long timescale; extending our analysis to the intermediate 1.5-scheme or other scalings of $1-\beta_2$ is left for future work.
— Adam Reduces a Unique Form of Sharpness: Theoretical Insights Near the Minimizer Manifold
(2511.02773 - Li et al., 4 Nov 2025) in Conclusions