High-probability complexity guarantees for nonconvex minimax problems (2405.14130v3)
Abstract: Stochastic smooth nonconvex minimax problems are prevalent in machine learning, e.g., GAN training, fair classification, and distributionally robust learning. Stochastic gradient descent ascent (GDA)-type methods are popular in practice due to their simplicity and single-loop nature. However, there is a significant gap between the theory and practice regarding high-probability complexity guarantees for these methods on stochastic nonconvex minimax problems. Existing high-probability bounds for GDA-type single-loop methods only apply to convex/concave minimax problems and to particular non-monotone variational inequality problems under some restrictive assumptions. In this work, we address this gap by providing the first high-probability complexity guarantees for nonconvex/PL minimax problems corresponding to a smooth function that satisfies the PL-condition in the dual variable. Specifically, we show that when the stochastic gradients are light-tailed, the smoothed alternating GDA method can compute an $\varepsilon$-stationary point within $O\big(\frac{\ell \kappa^2 \delta^2}{\varepsilon^4} + \frac{\kappa}{\varepsilon^2}(\ell+\delta^2\log(1/\bar{q}))\big)$ stochastic gradient calls with probability at least $1-\bar{q}$ for any $\bar{q}\in(0,1)$, where $\mu$ is the PL constant, $\ell$ is the Lipschitz constant of the gradient, $\kappa=\ell/\mu$ is the condition number, and $\delta^2$ denotes a bound on the variance of stochastic gradients. We also present numerical results on a nonconvex/PL problem with synthetic data and on distributionally robust optimization problems with real data, illustrating our theoretical findings.
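To make the single-loop structure concrete, here is a minimal sketch of the smoothed alternating GDA updates on a toy nonconvex/strongly concave problem (strong concavity in the dual variable implies the PL condition). The toy objective, the Gaussian noise model, and all constants (step sizes `tau_x`, `tau_y`, smoothing weight `p`, averaging weight `beta`, noise level `sigma`) are illustrative assumptions, not the step-size choices prescribed by the paper's analysis.

```python
# Sketch: smoothed alternating GDA (sm-AGDA) on a toy nonconvex / strongly
# concave problem. All constants below are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)
d_x, d_y = 5, 3
B = rng.standard_normal((d_x, d_y))
mu = 1.0  # strong-concavity (hence PL) constant of f(x, .) in y

def stoch_grads(x, y, sigma=0.1):
    """Unbiased stochastic gradients of the toy objective
    f(x, y) = sum(cos(x)) + x^T B y - 0.5 * mu * ||y||^2,
    with additive light-tailed (Gaussian) noise."""
    gx = -np.sin(x) + B @ y + sigma * rng.standard_normal(d_x)
    gy = B.T @ x - mu * y + sigma * rng.standard_normal(d_y)
    return gx, gy

# sm-AGDA state: primal x, dual y, and an auxiliary smoothing center z.
x = rng.standard_normal(d_x)
y = rng.standard_normal(d_y)
z = x.copy()

tau_x, tau_y = 0.05, 0.1  # primal / dual step sizes (placeholders)
p, beta = 1.0, 0.1        # smoothing weight and averaging weight (placeholders)

for t in range(2000):
    gx, _ = stoch_grads(x, y)
    # Primal descent step on the smoothed objective
    # K(x, y; z) = f(x, y) + (p/2) * ||x - z||^2.
    x = x - tau_x * (gx + p * (x - z))
    # Alternating dual ascent step, evaluated at the updated x.
    _, gy = stoch_grads(x, y)
    y = y + tau_y * gy
    # Slow update of the smoothing center toward the current primal iterate.
    z = z + beta * (x - z)

print("final x:", np.round(x, 3), "final y:", np.round(y, 3))
```

In the abstract's bound, the light-tailed (e.g., sub-Gaussian) noise assumption is what drives the $\log(1/\bar{q})$ dependence on the failure probability; the Gaussian perturbations above are only one example of such noise.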