
LEAD: Min-Max Optimization from a Physical Perspective (2010.13846v4)

Published 26 Oct 2020 in cs.LG, cs.GT, cs.MA, and math.OC

Abstract: Adversarial formulations such as generative adversarial networks (GANs) have rekindled interest in two-player min-max games. A central obstacle in the optimization of such games is the rotational dynamics that hinder their convergence. In this paper, we show that game optimization shares dynamic properties with particle systems subject to multiple forces, and one can leverage tools from physics to improve optimization dynamics. Inspired by the physical framework, we propose LEAD, an optimizer for min-max games. Next, using Lyapunov stability theory and spectral analysis, we study LEAD's convergence properties in continuous and discrete time settings for a class of quadratic min-max games to demonstrate linear convergence to the Nash equilibrium. Finally, we empirically evaluate our method on synthetic setups and CIFAR-10 image generation to demonstrate improvements in GAN training.
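To make the rotational-dynamics obstacle and the physics-inspired remedy concrete, below is a minimal sketch on the bilinear game f(x, y) = x·y. The LEAD-style update, its cross-coupling term, and all step sizes are illustrative assumptions based on the abstract, not the paper's exact algorithm or constants.

```python
# Minimal sketch (not the paper's exact algorithm or constants): on the
# bilinear game f(x, y) = x * y, plain simultaneous gradient descent-ascent
# (GDA) spirals away from the Nash equilibrium at (0, 0), while a LEAD-style
# update (momentum plus a cross-coupling term built from the mixed
# derivative d^2f/dxdy) damps the rotation. Coefficients are illustrative.

def grads(x, y):
    # f(x, y) = x * y  =>  df/dx = y, df/dy = x, d^2f/dxdy = 1
    return y, x

def gda(x, y, eta=0.1, steps=200):
    for _ in range(steps):
        gx, gy = grads(x, y)
        x, y = x - eta * gx, y + eta * gy  # descent in x, ascent in y
    return x, y

def lead_like(x, y, eta=0.1, beta=0.5, alpha=0.3, steps=200):
    x_prev, y_prev = x, y
    for _ in range(steps):
        gx, gy = grads(x, y)
        cross = 1.0                         # d^2f/dxdy for f = x * y
        vx, vy = x - x_prev, y - y_prev     # finite-difference "velocities"
        x_next = x + beta * vx - eta * gx - alpha * cross * vy
        y_next = y + beta * vy + eta * gy + alpha * cross * vx
        x_prev, y_prev, x, y = x, y, x_next, y_next
    return x, y

print("GDA:      ", gda(1.0, 1.0))        # iterates drift away from (0, 0)
print("LEAD-like:", lead_like(1.0, 1.0))  # iterates contract toward (0, 0)
```

On this toy game the plain GDA iterates rotate outward from the equilibrium at the origin, whereas the momentum-plus-coupling update contracts toward it, which is the qualitative improvement the abstract describes.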
