
Universal Gradient Descent Ascent Method for Nonconvex-Nonconcave Minimax Optimization (2212.12978v5)

Published 26 Dec 2022 in math.OC, cs.LG, and stat.ML

Abstract: Nonconvex-nonconcave minimax optimization has received intense attention over the last decade due to its broad applications in machine learning. Most existing algorithms rely on one-sided information, such as the convexity (resp. concavity) of the primal (resp. dual) functions, or other specific structures, such as the Polyak-Łojasiewicz (PŁ) and Kurdyka-Łojasiewicz (KŁ) conditions. However, verifying these regularity conditions is challenging in practice. To meet this challenge, we propose a novel universally applicable single-loop algorithm, the doubly smoothed gradient descent ascent method (DS-GDA), which naturally balances the primal and dual updates. That is, DS-GDA with the same hyperparameters is able to uniformly solve nonconvex-concave, convex-nonconcave, and nonconvex-nonconcave problems with one-sided KŁ properties, achieving convergence with $\mathcal{O}(\epsilon^{-4})$ complexity. Sharper (even optimal) iteration complexity can be obtained when the KŁ exponent is known. Specifically, under the one-sided KŁ condition with exponent $\theta\in(0,1)$, DS-GDA converges with an iteration complexity of $\mathcal{O}(\epsilon^{-2\max\{2\theta,1\}})$. They all match the corresponding best results in the literature. Moreover, we show that DS-GDA is practically applicable to general nonconvex-nonconcave problems even without any regularity conditions, such as the PŁ condition, KŁ condition, or weak Minty variational inequalities condition. For various challenging nonconvex-nonconcave examples in the literature, including "Forsaken", "Bilinearly-coupled minimax", "Sixth-order polynomial", and "PolarGame", the proposed DS-GDA can all get rid of limit cycles. To the best of our knowledge, this is the first first-order algorithm to achieve convergence on all of these formidable problems.
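
The abstract names DS-GDA as a single-loop method that smooths both the primal and the dual side, but it does not spell out the update rules. The following Python sketch illustrates one plausible reading of that idea for a scalar, unconstrained problem $\min_x \max_y f(x, y)$. Everything concrete here is an assumption and should be checked against the paper: the auxiliary anchors `z` and `v`, the regularized function $F(x,y,z,v) = f(x,y) + \tfrac{r_1}{2}(x-z)^2 - \tfrac{r_2}{2}(y-v)^2$, the order of the four updates, the absence of projections, and all parameter names and default values.

```python
from typing import Callable, Tuple

# Gradient oracle type: takes (x, y) and returns a partial derivative of f.
Grad = Callable[[float, float], float]


def ds_gda_step(
    x: float, y: float, z: float, v: float,
    grad_x: Grad, grad_y: Grad,
    *,
    r1: float = 1.0, r2: float = 1.0,    # assumed smoothing weights
    c: float = 0.1, alpha: float = 0.1,  # assumed primal / dual step sizes
    beta: float = 0.5, mu: float = 0.5,  # assumed anchor averaging weights
) -> Tuple[float, float, float, float]:
    """One hypothetical doubly smoothed GDA step (scalar, no constraints)."""
    # Descent on x for F: the gradient of f plus the pull toward anchor z.
    x_new = x - c * (grad_x(x, y) + r1 * (x - z))
    # Ascent on y for F, evaluated at the fresh x: gradient of f minus the
    # pull toward anchor v.
    y_new = y + alpha * (grad_y(x_new, y) - r2 * (y - v))
    # Move each anchor a fraction of the way toward its updated iterate.
    z_new = z + beta * (x_new - z)
    v_new = v + mu * (y_new - v)
    return x_new, y_new, z_new, v_new


def ds_gda(
    x0: float, y0: float, grad_x: Grad, grad_y: Grad,
    num_iters: int, **params: float,
) -> Tuple[float, float, float, float]:
    """Run the sketch for num_iters steps, starting the anchors at (x0, y0)."""
    state = (x0, y0, x0, y0)
    for _ in range(num_iters):
        state = ds_gda_step(*state, grad_x, grad_y, **params)
    return state


if __name__ == "__main__":
    # Bilinear toy problem f(x, y) = x * y, chosen only for illustration.
    x, y, z, v = ds_gda(1.0, 1.0, lambda x, y: y, lambda x, y: x, 2000)
    print(f"x={x:.4f}, y={y:.4f}")
```

The toy run at the bottom uses $f(x, y) = xy$ purely as an illustration; it is not one of the benchmark problems listed in the abstract, and the default parameters are not the authors' settings.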

Authors (5)
  1. Taoli Zheng (5 papers)
  2. Linglingzhi Zhu (10 papers)
  3. Anthony Man-Cho So (97 papers)
  4. Jose Blanchet (143 papers)
  5. Jiajin Li (26 papers)
Citations (10)
