Characterize Adam’s implicit bias beyond the minimizer-manifold neighborhood

Determine Adam’s implicit bias when the optimization trajectory leaves the neighborhood of a smooth minimizer manifold, and develop an analysis that characterizes the long-time dynamics without the local manifold assumption.

Background

The core analysis in the paper assumes that training proceeds near a smooth manifold of minimizers, enabling the use of projection operators and slow SDE techniques to isolate the implicit-bias dynamics.
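To make the locality assumption concrete, the following is a minimal sketch, not the paper's setup. It assumes a toy loss L(x, y) = 0.5 (xy - 1)^2, whose minimizers form the smooth manifold {xy = 1}, and a hand-written Adam update with optional Gaussian gradient noise. The function names (`loss_grad`, `run_adam`), the loss, and all hyperparameter values are hypothetical choices for illustration. The residual |xy - 1| serves as a simple proxy for how far the iterates are from the manifold.

```python
import numpy as np


def loss_grad(theta):
    """Gradient of the toy loss L(x, y) = 0.5 * (x*y - 1)**2 (an assumed example)."""
    x, y = theta
    r = x * y - 1.0  # residual; zero exactly on the minimizer manifold {xy = 1}
    return np.array([r * y, r * x])


def run_adam(theta0, steps=1500, lr=1e-2, beta1=0.9, beta2=0.999,
             eps=1e-8, noise_std=0.0, seed=0):
    """Run Adam on the toy loss and record |x*y - 1| after every step."""
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    m = np.zeros_like(theta)  # first-moment estimate
    v = np.zeros_like(theta)  # second-moment estimate
    residuals = np.empty(steps)
    for t in range(1, steps + 1):
        # Noisy gradient: exact gradient plus isotropic Gaussian noise.
        g = loss_grad(theta) + noise_std * rng.standard_normal(theta.shape)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g ** 2
        m_hat = m / (1.0 - beta1 ** t)  # bias correction
        v_hat = v / (1.0 - beta2 ** t)
        theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
        residuals[t - 1] = abs(theta[0] * theta[1] - 1.0)
    return theta, residuals


if __name__ == "__main__":
    for noise in (0.0, 0.1):
        theta, res = run_adam((2.0, 1.0), noise_std=noise)
        print(f"noise_std={noise}: final theta={theta}, "
              f"max residual over last 500 steps={res[-500:].max():.3e}")
```

Tracking the residual over time is one way to check empirically whether a run stays in the regime the local analysis covers; a run whose residual grows large would fall under the open question described here.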

The authors note that Adam’s behavior once the iterates leave this local neighborhood remains an open question, and that addressing it may require restarting the analysis from the SGD dynamics or developing new tools that handle departures from the manifold.

References

Despite these advances, several important avenues remain open. [...] Second, our derivations assume that the iterates remain close to a smooth minimizer manifold; understanding Adam’s implicit bias once the trajectory ventures beyond this local neighborhood may require restarting the analysis from the SGD dynamics.