The Anytime Convergence of Stochastic Gradient Descent with Momentum: From a Continuous-Time Perspective

Published 30 Oct 2023 in math.OC | arXiv:2310.19598v6

Abstract: We study the stochastic optimization problem from a continuous-time perspective, with a focus on the Stochastic Gradient Descent with Momentum (SGDM) method. We show that the trajectory of SGDM, despite its \emph{stochastic} nature, converges in $L_2$-norm to the solution of a \emph{deterministic} second-order Ordinary Differential Equation (ODE) as the stepsize goes to zero. This connection between the ODE and the algorithm proves useful for the discrete-time convergence analysis. More specifically, through the construction of a suitable Lyapunov function, we develop convergence results for the ODE, which are then translated into the corresponding convergence results for the discrete-time case. This approach yields a novel \emph{anytime} convergence guarantee for stochastic gradient methods. In particular, we prove that the sequence $\{x_k\}$, generated by running SGDM on a smooth convex function $f$, satisfies
\begin{align*}
\mathbb{P}\left(f(x_k) - f^* \le C\left(1+\log\frac{1}{\beta}\right)\frac{\log k}{\sqrt{k}},\;\text{for all $k$}\right)\ge 1-\beta\quad\text{for any $\beta>0$,}
\end{align*}
where $f^*=\min_{x\in\mathbb{R}^n} f(x)$ and $C$ is a constant. Rather than describing behavior at a single step, this result captures the convergence of the entire trajectory of the algorithm.
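
To make the guarantee concrete, here is a minimal numerical sketch. The abstract does not specify the paper's exact SGDM parameterization or stepsize schedule, so the sketch assumes the standard heavy-ball update, an illustrative $O(1/\sqrt{k})$ stepsize, and a smooth convex quadratic test function with Gaussian gradient noise; the constants are arbitrary. It tracks the normalized optimality gap over the whole trajectory, which is the quantity the anytime bound controls.

```python
import numpy as np

# Hedged sketch: heavy-ball SGDM on f(x) = 0.5 * ||A x - b||^2 with an
# unbiased noisy gradient oracle. The stepsize schedule and momentum
# value below are illustrative assumptions, not the paper's choices.

rng = np.random.default_rng(0)
n = 20
A = rng.standard_normal((n, n)) / np.sqrt(n)
b = rng.standard_normal(n)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]
f_star = 0.5 * np.linalg.norm(A @ x_star - b) ** 2

def f(x):
    return 0.5 * np.linalg.norm(A @ x - b) ** 2

def stochastic_grad(x, noise=0.1):
    # Exact gradient plus Gaussian noise: unbiased, bounded variance.
    return A.T @ (A @ x - b) + noise * rng.standard_normal(n)

x = np.zeros(n)
v = np.zeros(n)
momentum = 0.9
K = 10_000
gaps = np.empty(K)

for k in range(1, K + 1):
    eta_k = 0.5 / np.sqrt(k)          # illustrative O(1/sqrt(k)) stepsize
    g = stochastic_grad(x)
    v = momentum * v - eta_k * g      # heavy-ball velocity update
    x = x + v
    gaps[k - 1] = f(x) - f_star

# Anytime flavor of the guarantee: the *worst* normalized gap over the
# entire trajectory stays bounded, not just the gap at the final step.
ks = np.arange(1, K + 1)
ratios = gaps * np.sqrt(ks) / np.log(ks + 1)
print(f"final gap: {gaps[-1]:.3e}, "
      f"max of gap * sqrt(k)/log(k+1): {ratios.max():.3e}")
```

The printed maximum of $\big(f(x_k)-f^*\big)\sqrt{k}/\log(k+1)$ over all $k$ is the empirical analogue of the constant $C(1+\log\frac{1}{\beta})$ in the bound: a single number controlling the gap at every iteration simultaneously, rather than only at a fixed final iterate.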
