
Negative momentum’s convergence and acceleration properties in min–max optimization

Determine whether gradient-descent-ascent (GDA) augmented with negative momentum can (i) achieve accelerated convergence rates in unconstrained bilinear min–max problems and in smooth strongly-convex–strongly-concave min–max problems, and (ii) guarantee last-iterate convergence in smooth convex–concave min–max problems.
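For concreteness, the settings and dynamics in question can be written as follows; the notation here is ours, chosen for illustration, and is not taken from the source:

```latex
% Unconstrained min-max problem:
\min_{x \in \mathbb{R}^n} \; \max_{y \in \mathbb{R}^m} \; f(x, y)

% Bilinear case: f(x, y) = x^\top A y for some matrix A.
% Strongly-convex-strongly-concave case: f is L-smooth,
% \mu-strongly convex in x and \mu-strongly concave in y.

% GDA with heavy-ball momentum; "negative momentum" takes \beta < 0:
x_{t+1} = x_t - \eta \, \nabla_x f(x_t, y_t) + \beta \, (x_t - x_{t-1})
y_{t+1} = y_t + \eta \, \nabla_y f(x_t, y_t) + \beta \, (y_t - y_{t-1})
```

"Accelerate" here means matching the optimal rates for the respective problem class, which the paper establishes for slingshot stepsizes but which remains open for negative momentum.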


Background

The paper revisits the convergence of gradient-descent-ascent (GDA) by proposing time-varying, asymmetric, and sometimes negative stepsizes ("slingshot" schedules) that guarantee convergence and, in several settings, optimal rates. In contrasting these schedules with existing modifications of GDA, the authors discuss negative momentum, an approach that adds internal dynamics to GDA and has proved empirically useful but remains theoretically less understood in standard smooth min–max regimes.
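To make the dynamics concrete, here is a minimal sketch of GDA with a negative heavy-ball momentum term on the bilinear toy game f(x, y) = x·y, whose unique saddle point is (0, 0). The alternating update order, the stepsize, and the momentum coefficient are illustrative assumptions of ours, not values from the paper:

```python
# Alternating GDA with heavy-ball momentum on f(x, y) = x * y.
# A negative beta gives "negative momentum"; eta and beta below are
# illustrative choices, not constants taken from the source.

def gda_negative_momentum(x0=1.0, y0=1.0, eta=0.1, beta=-0.5, steps=10_000):
    """Run alternating GDA with momentum and return the last iterate."""
    x_prev, y_prev = x0, y0
    x, y = x0, y0
    for _ in range(steps):
        # For f(x, y) = x * y: grad_x f = y and grad_y f = x.
        x_next = x - eta * y + beta * (x - x_prev)       # descent step on x
        y_next = y + eta * x_next + beta * (y - y_prev)  # ascent step on y, using the fresh x
        x_prev, y_prev = x, y
        x, y = x_next, y_next
    return x, y
```

On this game, simultaneous GDA without momentum diverges and alternating GDA without momentum merely cycles, while the negative-momentum iterates spiral in slowly; whether such momentum can be made to *accelerate*, rather than merely converge, is precisely the open question stated above.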

Within this context, the authors explicitly note that it is unknown whether negative momentum can accelerate in bilinear or strongly-convex–strongly-concave settings, or even ensure convergence in convex–concave settings. Their results demonstrate that slingshot stepsizes achieve these goals, highlighting the gap in understanding for negative momentum and motivating a precise characterization of its capabilities.

References

Moreover, it is unknown if negative momentum can accelerate in bilinear or strongly-convex-strongly-concave settings or can even converge in convex-concave settings, whereas we show that all of this is possible using the slingshot stepsizes.

Negative Stepsizes Make Gradient-Descent-Ascent Converge (2505.01423 - Shugart et al., 2 May 2025) in Section 1.2 (Slingshot stepsize schedules), paragraph “Three key properties”