Generalized Optimistic Methods for Convex-Concave Saddle Point Problems (2202.09674v2)

Published 19 Feb 2022 in math.OC, cs.LG, and stat.ML

Abstract: The optimistic gradient method has seen increasing popularity for solving convex-concave saddle point problems. To analyze its iteration complexity, a recent work [arXiv:1906.01115] proposed an interesting perspective that interprets this method as an approximation to the proximal point method. In this paper, we follow this approach and distill the underlying idea of optimism to propose a generalized optimistic method, which includes the optimistic gradient method as a special case. Our general framework can handle constrained saddle point problems with composite objective functions and can work with arbitrary norms using Bregman distances. Moreover, we develop a backtracking line search scheme to select the step sizes without knowledge of the smoothness coefficients. We instantiate our method with first-, second- and higher-order oracles and give best-known global iteration complexity bounds. For our first-order method, we show that the averaged iterates converge at a rate of $O(1/N)$ when the objective function is convex-concave, and it achieves linear convergence when the objective is strongly-convex-strongly-concave. For our second- and higher-order methods, under the additional assumption that the distance-generating function has Lipschitz gradient, we prove a complexity bound of $O(1/\epsilon^{\frac{2}{p+1}})$ in the convex-concave setting and a complexity bound of $O((L_p D^{\frac{p-1}{2}}/\mu)^{\frac{2}{p+1}}+\log\log\frac{1}{\epsilon})$ in the strongly-convex-strongly-concave setting, where $L_p$ ($p\geq 2$) is the Lipschitz constant of the $p$-th-order derivative, $\mu$ is the strong convexity parameter, and $D$ is the initial Bregman distance to the saddle point. Moreover, our line search scheme provably only requires a constant number of calls to a subproblem solver per iteration on average, making our first- and second-order methods particularly amenable to implementation.
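
For concreteness, the first-order special case that the framework generalizes is the optimistic gradient (OGDA) update $z_{k+1} = z_k - \eta\,(2F(z_k) - F(z_{k-1}))$, which mimics the implicit proximal-point step by extrapolating the monotone operator $F$ from the two most recent iterates. The sketch below, assuming NumPy, runs this special case on a toy bilinear problem $\min_x \max_y\, x^\top A y$ with a fixed step size; the problem instance, step-size rule, and iteration count are illustrative choices and not from the paper, which instead works with Bregman distances and selects step sizes by backtracking line search.

```python
import numpy as np

# Sketch of the optimistic gradient (OGDA) iteration, the first-order special
# case of the generalized optimistic method, on the toy bilinear saddle point
#   min_x max_y  f(x, y) = x^T A y,   saddle point at (x, y) = (0, 0).
# With F(z) = (grad_x f, -grad_y f), the update
#   z_{k+1} = z_k - eta * (2 F(z_k) - F(z_{k-1}))
# approximates the implicit proximal-point step by extrapolating F.

rng = np.random.default_rng(0)
A = rng.standard_normal((5, 5))

# Fixed step size tied to the Lipschitz constant ||A|| of F (an illustrative
# choice; the paper selects step sizes via a backtracking line search instead).
eta = 0.25 / np.linalg.norm(A, 2)

def F(x, y):
    """Monotone operator of f(x, y) = x^T A y."""
    return A @ y, -A.T @ x

x = rng.standard_normal(5)
y = rng.standard_normal(5)
gx_prev, gy_prev = F(x, y)  # at the first step, reuse the current operator value

for _ in range(5000):
    gx, gy = F(x, y)
    x = x - eta * (2.0 * gx - gx_prev)
    y = y - eta * (2.0 * gy - gy_prev)
    gx_prev, gy_prev = gx, gy

# Distance to the saddle point (0, 0); it decreases over the run.
print("||(x, y)|| =", float(np.sqrt(np.linalg.norm(x) ** 2 + np.linalg.norm(y) ** 2)))
```

Replacing the Euclidean step above with a Bregman proximal step over the constraint set, and choosing the step size by the paper's backtracking rule, corresponds to the constrained composite setting described in the abstract.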

References (65)
  1. “Optimal methods for higher-order smooth monotone variational inequalities” In arXiv preprint arXiv:2205.06167, 2022
  2. M Marques Alves and Benar F Svaiter “A search-free $O(1/k^{3/2})$ homotopy inexact proximal-Newton extragradient algorithm for monotone variational inequalities” In arXiv preprint arXiv:2308.05887, 2023
  3. K.J. Arrow, L. Hurwicz and H. Uzawa “Studies in Linear and Non-Linear Programming”, Stanford Mathematical Studies in the Social Sciences Stanford, CA: Stanford University Press, 1958
  4. “Interior Projection-Like Methods for Monotone Variational Inequalities” In Mathematical Programming 104.1 Springer Science+Business Media LLC, 2005, pp. 39–68
  5. “A Tight and Unified Analysis of Gradient-Based Methods for a Whole Spectrum of Differentiable Games” In Proceedings of the 23rd International Conference on Artificial Intelligence and Statistics (AISTATS) 108 PMLR, 2020, pp. 2863–2873
  6. Tamer Başar and Geert Jan Olsder “Dynamic Noncooperative Game Theory, 2nd Edition” SIAM, 1998
  7. Heinz H. Bauschke, Jérôme Bolte and Marc Teboulle “A Descent Lemma beyond Lipschitz Gradient Continuity: First-Order Methods Revisited and Applications” In Mathematics of Operations Research 42.2 Institute for Operations Research and the Management Sciences (INFORMS), 2017, pp. 330–348
  8. Amir Beck “First-Order Methods in Optimization” SIAM, 2017
  9. Brian Bullins and Kevin A Lai “Higher-order methods for convex-concave min-max optimization and monotone variational inequalities” In SIAM Journal on Optimization 32.3 SIAM, 2022, pp. 2208–2229
  10. Coralia Cartis, Nicholas I.M. Gould and Philippe L. Toint “Adaptive Cubic Regularisation Methods for Unconstrained Optimization. Part I: Motivation, Convergence and Numerical Results” In Mathematical Programming 127.2 Springer Science+Business Media LLC, 2011, pp. 245–295
  11. “A First-Order Primal-Dual Algorithm for Convex Problems With Applications to Imaging” In Journal of Mathematical Imaging and Vision 40.1 Springer Science+Business Media LLC, 2011, pp. 120–145
  12. “On the Ergodic Convergence Rates of a First-Order Primal–Dual Algorithm” In Mathematical Programming 159.1-2 Springer Science+Business Media LLC, 2016, pp. 253–287
  13. “Convergence Analysis of a Proximal-Like Minimization Algorithm Using Bregman Functions” In SIAM Journal on Optimization 3.3 Society for Industrial & Applied Mathematics (SIAM), 1993, pp. 538–543
  14. Yunmei Chen, Guanghui Lan and Yuyuan Ouyang “Optimal Primal-Dual Methods for a Class of Saddle Point Problems” In SIAM Journal on Optimization 24.4, 2014, pp. 1779–1814
  15. “Online Optimization with Gradual Variations” In Proceedings of the 25th Annual Conference on Learning Theory (COLT) 23, 2012, pp. 6.1–6.20
  16. Laurent Condat “A Primal-Dual Splitting Method for Convex Optimization Involving Lipschitzian, Proximable and Linear Composite Terms” In Journal of Optimization Theory and Applications 158.2, 2013, pp. 460–479
  17. “Training GANs with Optimism” In Proceedings of International Conference on Learning Representations (ICLR), 2018
  18. Jonathan Eckstein “Nonlinear Proximal Point Algorithms Using Bregman Functions, with Applications to Convex Programming” In Mathematics of Operations Research 18.1 Institute for Operations Research and the Management Sciences (INFORMS), 1993, pp. 202–226
  19. “Finite-Dimensional Variational Inequalities and Complementarity Problems” New York: Springer-Verlag, 2003
  20. Alireza Fallah, Asuman Ozdaglar and Sarath Pattathil “An Optimal Multistage Stochastic Gradient Method for Minimax Problems” In Proceedings of the 59th IEEE Conference on Decision and Control (CDC) IEEE, 2020
  21. “A Variational Inequality Perspective on Generative Adversarial Networks” In Proceedings of International Conference on Learning Representations (ICLR), 2019
  22. Erfan Yazdandoost Hamedani and Necdet Serhat Aybat “A Primal-Dual Algorithm with Line Search for General Convex-Concave Saddle Point Problems” In SIAM Journal on Optimization 31.2, 2021, pp. 1299–1329
  23. Niao He, Anatoli Juditsky and Arkadi Nemirovski “Mirror Prox Algorithm for Multi-Term Composite Minimization and Semi-Separable Problems” In Computational Optimization and Applications 61.2 Springer Science+Business Media LLC, 2015, pp. 275–319
  24. “On the Convergence of Single-Call Stochastic Extra-Gradient Methods” In Advances in Neural Information Processing Systems 32, 2019
  25. Kevin Huang, Junyu Zhang and Shuzhong Zhang “Cubic regularized Newton method for the saddle point models: A global and local convergence analysis” In J. Sci. Comput. 91.60 Springer, 2022, pp. 1–31
  26. “An approximation-based regularized extra-gradient method for monotone variational inequalities” In arXiv preprint arXiv:2210.04440, 2022
  27. Pooria Joulani, András György and Csaba Szepesvári “A modular analysis of adaptive (non-) convex optimization: Optimism, composite objectives, variance reduction, and variational bounds” In Theor. Comput. Sci. 808 Elsevier, 2020, pp. 108–138
  28. G. Korpelevich “The Extragradient Method for Finding Saddle Points and Other Problems” In Ekonomika i Matematicheskie Metody 12, 1976, pp. 747–756 (in Russian; English translation in Matekon)
  29. Georgios Kotsalis, Guanghui Lan and Tianjiao Li “Simple and optimal methods for stochastic variational inequalities, I: operator extrapolation” In SIAM J. Optim. 32.3 SIAM, 2022, pp. 2041–2073
  30. “Interaction Matters: A Note on Non-Asymptotic Local Convergence of Generative Adversarial Networks” In Proceedings of the 22nd International Conference on Artificial Intelligence and Statistics (AISTATS) 89 PMLR, 2019, pp. 907–915
  31. “Perseus: A simple high-order regularization method for variational inequalities” In arXiv preprint arXiv:2205.03202, 2022
  32. Tianyi Lin, Panayotis Mertikopoulos and Michael Jordan “Explicit second-order min-max optimization methods with optimal convergence guarantee” In arXiv preprint arXiv:2210.12860, 2022
  33. Haihao Lu, Robert M. Freund and Yurii Nesterov “Relatively Smooth Convex Optimization by First-Order Methods, and Applications” In SIAM Journal on Optimization 28.1 Society for Industrial & Applied Mathematics (SIAM), 2018, pp. 333–354
  34. Yu Malitsky “Proximal Extrapolated Gradient Methods for Variational Inequalities” In Optimization Methods and Software 33.1 Informa UK Limited, 2017, pp. 140–164
  35. Yu. Malitsky “Projected Reflected Gradient Methods for Monotone Variational Inequalities” In SIAM Journal on Optimization 25.1 Society for Industrial & Applied Mathematics (SIAM), 2015, pp. 502–520
  36. “A First-Order Primal-Dual Algorithm with Linesearch” In SIAM Journal on Optimization 28.1, 2018, pp. 411–432
  37. Yura Malitsky and Matthew K. Tam “A Forward-Backward Splitting Method for Monotone Inclusions without Cocoercivity” In SIAM Journal on Optimization 30.2 Society for Industrial & Applied Mathematics (SIAM), 2020, pp. 1451–1472
  38. B. Martinet “Brève Communication. Régularisation D’inéquations Variationnelles Par Approximations Successives” In ESAIM: Mathematical Modelling and Numerical Analysis 4.R3 EDP Sciences, 1970, pp. 154–158
  39. “Optimistic mirror descent in saddle-point problems: Going the extra (gradient) mile” In International Conference on Learning Representations (ICLR), 2018
  40. Aryan Mokhtari, Asuman Ozdaglar and Sarath Pattathil “A Unified Analysis of Extra-Gradient and Optimistic Gradient Methods for Saddle Point Problems: Proximal Point Approach” In Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics (AISTATS) 108 PMLR, 2020, pp. 1497–1507
  41. Aryan Mokhtari, Asuman E. Ozdaglar and Sarath Pattathil “Convergence Rate of $\mathcal{O}(1/k)$ for Optimistic Gradient and Extragradient Methods in Smooth Convex-Concave Saddle Point Problems” In SIAM Journal on Optimization 30.4 Society for Industrial & Applied Mathematics (SIAM), 2020, pp. 3230–3251
  42. Renato D.C. Monteiro and B.F. Svaiter “Complexity of Variants of Tseng’s Modified F-B Splitting and Korpelevich’s Methods for Hemivariational Inequalities with Applications to Saddle-Point and Convex Optimization Problems” In SIAM Journal on Optimization 21.4 Society for Industrial & Applied Mathematics (SIAM), 2011, pp. 1688–1720
  43. Renato D.C. Monteiro and B.F. Svaiter “On the Complexity of the Hybrid Proximal Extragradient Method for the Iterates and the Ergodic Mean” In SIAM Journal on Optimization 20.6 Society for Industrial & Applied Mathematics (SIAM), 2010, pp. 2755–2787
  44. Renato D.C. Monteiro and Benar F. Svaiter “Iteration-Complexity of a Newton Proximal Extragradient Method for Monotone Variational Inequalities and Inclusion Problems” In SIAM Journal on Optimization 22.3 Society for Industrial & Applied Mathematics (SIAM), 2012, pp. 914–935
  45. Arkadi Nemirovski “Prox-Method with Rate of Convergence $O(1/t)$ for Variational Inequalities with Lipschitz Continuous Monotone Operators and Smooth Convex-Concave Saddle Point Problems” In SIAM Journal on Optimization 15.1 Society for Industrial & Applied Mathematics (SIAM), 2004, pp. 229–251
  46. Yu. Nesterov “Accelerating the Cubic Regularization of Newton’s Method on Convex Problems” In Mathematical Programming 112.1 Springer Science+Business Media LLC, 2008, pp. 159–181
  47. Yurii Nesterov “Cubic Regularization of Newton’s Method for Convex Problems with Constraints” In CORE Discussion Paper No. 2006/39, 2006
  48. Yurii Nesterov “Dual Extrapolation and Its Applications to Solving Variational Inequalities and Related Problems” In Mathematical Programming 109.2-3 Springer Science+Business Media LLC, 2007, pp. 319–344
  49. Yurii Nesterov “Implementable Tensor Methods in Unconstrained Convex Optimization” In Mathematical Programming Springer Science+Business Media LLC, 2019
  50. “Cubic Regularization of Newton Method and Its Global Performance” In Mathematical Programming 108.1 Springer Science+Business Media LLC, 2006, pp. 177–205
  51. “Solving Strongly Monotone Variational and Quasi-Variational Inequalities” In Discrete & Continuous Dynamical Systems - A 31.4 American Institute of Mathematical Sciences (AIMS), 2011, pp. 1383–1396
  52. “Tensor Methods for Strongly Convex Strongly Concave Saddle Point Problems and Strongly Monotone Variational Inequalities” In arXiv preprint arXiv:2012.15595, 2020
  53. “Training GANs with Centripetal Acceleration” In Optimization Methods and Software 35.5 Informa UK Limited, 2020, pp. 955–973
  54. L.D. Popov “A Modification of the Arrow-Hurwicz Method for Search of Saddle Points” In Mathematical Notes of the Academy of Sciences of the USSR 28.5 Springer Science+Business Media LLC, 1980, pp. 845–848
  55. “Online Learning with Predictable Sequences” In Proceedings of the 26th Annual Conference on Learning Theory (COLT) 30 PMLR, 2013, pp. 993–1019
  56. “Optimization, Learning, and Games with Predictable Sequences” In Advances in Neural Information Processing Systems 26, 2013
  57. R.Tyrrell Rockafellar “Monotone Operators and the Proximal Point Algorithm” In SIAM Journal on Control and Optimization 14.5 Society for Industrial & Applied Mathematics (SIAM), 1976, pp. 877–898
  58. “Large-Scale Convex Optimization: Algorithms & Analyses via Monotone Operators” Cambridge University Press, 2022
  59. “A Hybrid Approximate Extragradient – Proximal Point Algorithm Using the Enlargement of a Maximal Monotone Operator” In Set-Valued Analysis 7.4 Springer Science+Business Media LLC, 1999, pp. 323–345
  60. Benar Fux Svaiter “Complexity of the relaxed hybrid proximal-extragradient method under the large-step condition” In arXiv preprint arXiv:2303.04972, 2023
  61. Paul Tseng “A Modified Forward-Backward Splitting Method for Maximal Monotone Mappings” In SIAM Journal on Control and Optimization 38.2 Society for Industrial & Applied Mathematics (SIAM), 2000, pp. 431–446
  62. Paul Tseng “On Accelerated Proximal Gradient Methods for Convex-Concave Optimization” Submitted to SIAM Journal on Optimization, 2008
  63. Paul Tseng “On Linear Convergence of Iterative Methods for the Variational Inequality Problem” In Journal of Computational and Applied Mathematics 60.1-2 Elsevier BV, 1995, pp. 237–252
  64. “Fast Distributionally Robust Learning with Variance-Reduced Min-Max Optimization” In Proceedings of The 25th International Conference on Artificial Intelligence and Statistics (AISTATS) 151 PMLR, 2022, pp. 1219–1250
  65. “Stochastic Primal-Dual Coordinate Method for Regularized Empirical Risk Minimization” In International Conference on Machine Learning (ICML) 37 PMLR, 2015, pp. 353–361