Faster convergence of momentum-based methods in nonconvex optimization

Determine whether momentum-based methods, such as Nesterov's accelerated gradient descent, achieve faster convergence rates than standard gradient descent for nonconvex optimization problems, specifically with respect to finding second-order stationary points that avoid strict saddle points.

Background

Prior work established that accelerated gradient methods improve rates over gradient descent in convex optimization, but these analyses do not extend directly to nonconvex objectives where saddle points are prevalent. The paper frames a central question about whether acceleration (momentum-based methods) can yield faster rates than gradient descent when the goal is to reach second-order stationary points, which exclude strict saddle points.

The authors then introduce a perturbed accelerated gradient descent algorithm and show that it finds ε-second-order stationary points in Õ(1/ε^{7/4}) iterations, thereby answering the posed question affirmatively. This entry records the question's explicitly stated open status prior to the presentation of that result.
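
For concreteness, the sketch below illustrates what a perturbed accelerated gradient descent loop of this kind can look like: standard Nesterov-style momentum updates, an occasional small random perturbation when the gradient is nearly zero (so iterates can leave the vicinity of a strict saddle), and a fallback step that exploits negative curvature when the objective looks locally "too nonconvex". The hyperparameters (eta, theta, gamma, s, r, eps, t_noise), the perturbation scheme, the exact form of the nonconvexity test, and the toy objective are illustrative assumptions, not the paper's precise algorithm or parameter choices.

```python
# Minimal sketch of a perturbed accelerated gradient descent loop; all
# hyperparameters and the toy objective are placeholders for exposition.
import numpy as np


def sample_unit_ball(rng, d):
    """Draw a point uniformly from the d-dimensional unit ball."""
    u = rng.standard_normal(d)
    return (u / np.linalg.norm(u)) * rng.uniform() ** (1.0 / d)


def negative_curvature_exploitation(f, x, v, s):
    """If momentum is small, probe +/- s along it and keep the lower point."""
    if np.linalg.norm(v) >= s:
        return x, np.zeros_like(v)   # momentum already moved far enough
    delta = s * v / (np.linalg.norm(v) + 1e-12)
    best = min((x + delta, x - delta), key=f)
    return best, np.zeros_like(v)    # reset momentum after the probe


def perturbed_agd(f, grad, x0, eta=1e-3, theta=0.1, gamma=0.5, s=0.1,
                  r=1e-3, eps=1e-3, t_noise=50, max_iter=10_000, seed=0):
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    last_perturbation = -t_noise - 1
    for t in range(max_iter):
        # Perturb when the gradient is tiny and no recent perturbation
        # occurred, so iterates can escape strict saddle points.
        if np.linalg.norm(grad(x)) <= eps and t - last_perturbation > t_noise:
            x = x + r * sample_unit_ball(rng, x.shape[0])
            last_perturbation = t
        # Standard Nesterov-style momentum step.
        y = x + (1.0 - theta) * v
        x_next = y - eta * grad(y)
        v_next = x_next - x
        # If the objective looks "too nonconvex" between x and y, fall back
        # to a negative-curvature-exploitation step instead of the AGD step.
        if f(x) < f(y) + grad(y) @ (x - y) - 0.5 * gamma * np.dot(x - y, x - y):
            x_next, v_next = negative_curvature_exploitation(f, x, v, s)
        x, v = x_next, v_next
    return x


if __name__ == "__main__":
    # Toy nonconvex objective: strict saddle at the origin, minima near (+/-1, 0).
    f = lambda z: 0.25 * (z[0] ** 2 - 1.0) ** 2 + 0.5 * z[1] ** 2
    grad = lambda z: np.array([z[0] * (z[0] ** 2 - 1.0), z[1]])
    print(perturbed_agd(f, grad, np.array([0.0, 0.0])))
```

The toy run starts exactly at the strict saddle, where plain gradient descent would stall; the small random perturbation combined with the momentum dynamics is what lets the sketch move toward a local minimum.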

References

It is open as to whether momentum-based methods yield faster rates in the nonconvex setting, specifically when we consider the convergence criterion of second-order stationarity.

Accelerated Gradient Descent Escapes Saddle Points Faster than Gradient Descent (1711.10456 - Jin et al., 2017) in Section 1 (Introduction)