On Tractable $\Phi$-Equilibria in Non-Concave Games (2403.08171v2)
Abstract: While Online Gradient Descent and other no-regret learning procedures are known to efficiently converge to a coarse correlated equilibrium in games where each agent's utility is concave in their own strategy, this is not the case when utilities are non-concave -- a common scenario in machine learning applications involving strategies parameterized by deep neural networks, or when agents' utilities are computed by neural networks, or both. Non-concave games introduce significant game-theoretic and optimization challenges: (i) Nash equilibria may not exist; (ii) local Nash equilibria, though existing, are intractable; and (iii) mixed Nash, correlated, and coarse correlated equilibria generally have infinite support and are intractable. To sidestep these challenges, we revisit the classical solution concept of $\Phi$-equilibria introduced by Greenwald and Jafari [2003], which is guaranteed to exist for an arbitrary set of strategy modifications $\Phi$ even in non-concave games [Stoltz and Lugosi, 2007]. However, the tractability of $\Phi$-equilibria in such games remains elusive. In this paper, we initiate the study of tractable $\Phi$-equilibria in non-concave games and examine several natural families of strategy modifications. We show that when $\Phi$ is finite, there exists an efficient uncoupled learning algorithm that converges to the corresponding $\Phi$-equilibria. Additionally, we explore cases where $\Phi$ is infinite but consists of local modifications, showing that Online Gradient Descent can efficiently approximate $\Phi$-equilibria in non-trivial regimes.
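To make the notion of $\Phi$-regret behind the abstract concrete, the following is a minimal sketch for a finite set of strategy modifications. It assumes the standard definition, in which a player's $\Phi$-regret is the largest cumulative utility gain that any single modification $\phi \in \Phi$ would have achieved in hindsight. The names `phi_regret`, `utilities`, `plays`, and `modifications`, the one-dimensional strategy space, and the example utility are all illustrative assumptions, not notation or code from the paper.

```python
from typing import Callable, Sequence

Strategy = float  # assumption: one-dimensional strategies, for illustration only


def phi_regret(
    utilities: Sequence[Callable[[Strategy], float]],
    plays: Sequence[Strategy],
    modifications: Sequence[Callable[[Strategy], Strategy]],
) -> float:
    """Largest cumulative gain any single modification phi would have achieved.

    Computes max over phi of sum_t [u_t(phi(x_t)) - u_t(x_t)].
    """
    return max(
        sum(u(phi(x)) - u(x) for u, x in zip(utilities, plays))
        for phi in modifications
    )


# Hypothetical example: a non-concave utility u(x) = x^2 on the interval [-1, 1].
def utility(x: Strategy) -> float:
    return x * x


plays = [0.0, 0.5, -0.5]
utilities = [utility] * len(plays)
modifications = [
    lambda x: x,    # identity (keeps the regret non-negative)
    lambda x: -x,   # reflection
    lambda x: 1.0,  # constant deviation to the boundary point 1
]

regret = phi_regret(utilities, plays, modifications)
print(regret)  # 2.5
```

In this toy run only the constant deviation improves on the played strategies, so it alone determines the regret. When every player's average $\Phi$-regret is small, the empirical distribution of play is an approximate $\Phi$-equilibrium, which is the sense in which a learning procedure is said to converge to one.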
- Last-iterate convergence rates for min-max optimization: Convergence of Hamiltonian gradient descent and consensus optimization. In Algorithmic Learning Theory, pages 3–47. PMLR, 2021.
- Learning in non-convex games with an optimization oracle. In Conference on Learning Theory, pages 18–29. PMLR, 2019.
- Near-optimal no-regret learning for correlated equilibria in multi-player general-sum games. In Proceedings of the 54th Annual ACM SIGACT Symposium on Theory of Computing (STOC), 2022a.
- Uncoupled learning dynamics with $O(\log T)$ swap regret in multiplayer games. In Advances in Neural Information Processing Systems (NeurIPS), 2022b.
- Near-optimal phi-regret learning in extensive-form games. In International Conference on Machine Learning (ICML), 2023.
- Existence of an equilibrium for a competitive economy. Econometrica, pages 265–290, 1954.
- The nonstochastic multiarmed bandit problem. SIAM Journal on Computing, 32(1):48–77, 2002.
- Dynamic local regret for non-convex online forecasting. Advances in Neural Information Processing Systems, 32, 2019.
- Efficient phi-regret minimization in extensive-form games via online mirror descent. Advances in Neural Information Processing Systems, 35:22313–22325, 2022.
- Constrained phi-equilibria. In International Conference on Machine Learning, 2023.
- David Blackwell. An analog of the minimax theorem for vector payoffs. Pacific Journal of Mathematics, 6(1):1–8, January 1956. ISSN 0030-8730. URL https://projecteuclid.org/journals/pacific-journal-of-mathematics/volume-6/issue-1/An-analog-of-the-minimax-theorem-for-vector-payoffs/pjm/1103044235.full. Publisher: Pacific Journal of Mathematics, A Non-profit Corporation.
- George W Brown. Iterative solution of games by fictitious play. Act. Anal. Prod Allocation, 13(1):374, 1951.
- Sébastien Bubeck et al. Convex optimization: Algorithms and complexity. Foundations and Trends® in Machine Learning, 8(3-4):231–357, 2015.
- Accelerated single-call methods for constrained min-max optimization. International Conference on Learning Representations (ICLR), 2023.
- Prediction, learning, and games. Cambridge University Press, 2006.
- Xi Chen and Binghui Peng. Hedging in games: Faster convergence of external and swap regrets. Advances in Neural Information Processing Systems (NeurIPS), 33:18990–18999, 2020.
- Settling the complexity of computing two-player Nash equilibria. Journal of the ACM (JACM), 56(3):1–57, 2009.
- From External to Swap Regret 2.0: An Efficient Reduction and Oblivious Adversary for Large Action Spaces, December 2023. URL http://arxiv.org/abs/2310.19786. arXiv:2310.19786 [cs].
- Constantinos Daskalakis. Non-concave games: A challenge for game theory’s next 100 years. Cowles Preprints, 2022.
- The limit points of (optimistic) gradient descent in min-max optimization. In the 32nd Annual Conference on Neural Information Processing Systems (NeurIPS), 2018.
- The complexity of computing a Nash equilibrium. Communications of the ACM, 52(2):89–97, 2009.
- Near-optimal no-regret learning in general games. Advances in Neural Information Processing Systems (NeurIPS), 2021a.
- The complexity of constrained min-max optimization. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing (STOC), 2021b.
- Stay-on-the-ridge: Guaranteed convergence to local minimax equilibrium in nonconvex-nonconcave games. In The Thirty Sixth Annual Conference on Learning Theory, pages 5146–5198. PMLR, 2023.
- Gerard Debreu. A social equilibrium existence theorem. Proceedings of the National Academy of Sciences, 38(10):886–893, 1952.
- Efficient methods for structured nonconvex-nonconcave min-max optimization. International Conference on Artificial Intelligence and Statistics, 2021.
- Ky Fan. Minimax theorems. Proceedings of the National Academy of Sciences, 39(1):42–47, 1953.
- Near-optimal no-regret learning dynamics for general convex games. Advances in Neural Information Processing Systems, 35:39076–39089, 2022a.
- Simple uncoupled no-regret learning dynamics for extensive-form correlated equilibrium. Journal of the ACM, 69(6), 2022b.
- Local convergence analysis of gradient descent ascent with finite timescale separation. In Proceedings of the International Conference on Learning Representations, 2021.
- Implicit learning dynamics in Stackelberg games: Equilibria characterization, convergence analysis, and empirical study. In International Conference on Machine Learning, pages 3133–3144. PMLR, 2020.
- Online learning with non-convex losses and non-stationary regret. In International Conference on Artificial Intelligence and Statistics, pages 235–243. PMLR, 2018.
- George B. Dantzig. Linear Programming and Extensions. Princeton University Press, 1963.
- Irving L Glicksberg. A further generalization of the Kakutani fixed point theorem, with application to Nash equilibrium points. Proceedings of the American Mathematical Society, 3(1):170–174, 1952.
- No-regret learning in convex games. In Proceedings of the 25th International Conference on Machine Learning, pages 360–367, 2008.
- A general class of no-regret learning algorithms and game-theoretic equilibria. In Learning Theory and Kernel Machines: 16th Annual Conference on Learning Theory and 7th Kernel Workshop, COLT/Kernel 2003, Washington, DC, USA, August 24-27, 2003. Proceedings, pages 2–12. Springer, 2003.
- Online nonconvex optimization with limited instantaneous oracle feedback. In The Thirty Sixth Annual Conference on Learning Theory, pages 3328–3355. PMLR, 2023.
- Regret minimization in stochastic non-convex learning via a proximal-gradient approach. In International Conference on Machine Learning, pages 4008–4017. PMLR, 2021.
- James Hannan. Approximation to Bayes risk in repeated play. Contributions to the Theory of Games, 3:97–139, 1957.
- Computational Equivalence of Fixed Points and No Regret Algorithms, and Convergence to Equilibria. In Advances in Neural Information Processing Systems, volume 20. Curran Associates, Inc., 2007. URL https://proceedings.neurips.cc/paper/2007/hash/e4bb4c5173c2ce17fd8fcd40041c068f-Abstract.html.
- Efficient regret minimization in non-convex games. In International Conference on Machine Learning, pages 1433–1441. PMLR, 2017.
- Online non-convex optimization with imperfect feedback. Advances in Neural Information Processing Systems, 33:17224–17235, 2020.
- What is local optimality in nonconvex-nonconcave minimax optimization? In International Conference on Machine Learning (ICML), pages 4880–4889. PMLR, 2020.
- Samuel Karlin. Mathematical methods and theory in games, programming, and economics: Volume II: the theory of infinite games, volume 2. Addison-Wesley, 1959.
- Samuel Karlin. Mathematical Methods and Theory in Games, Programming, and Economics: Volume 2: The Theory of Infinite Games. Elsevier, 2014.
- The hedge algorithm on a continuum. In International Conference on Machine Learning, pages 824–832. PMLR, 2015.
- Online learning in adversarial Lipschitz environments. In Joint European Conference on Machine Learning and Knowledge Discovery in Databases, pages 305–320. Springer, 2010.
- Greedy adversarial equilibrium: an efficient alternative to nonconvex-nonconcave min-max optimization. In Proceedings of the 53rd Annual ACM SIGACT Symposium on Theory of Computing, pages 896–909, 2021.
- On gradient-based learning in continuous games. SIAM Journal on Mathematics of Data Science, 2(1):103–131, 2020.
- Lionel McKenzie. On equilibrium in Graham’s model of world trade and other competitive systems. Econometrica, pages 147–161, 1954.
- Learning in games with continuous action sets and unknown payoff functions. Mathematical Programming, 173:465–507, 2019.
- Hindsight and sequential rationality of correlated play. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 5584–5594, 2021a.
- Efficient deviation types and learning for hindsight rationality in extensive-form games. In International Conference on Machine Learning, pages 7818–7828. PMLR, 2021b.
- Some NP-complete problems in quadratic and nonlinear programming. Mathematical Programming, 39(2):117–129, June 1987. ISSN 1436-4646. doi: 10.1007/BF02592948. URL https://doi.org/10.1007/BF02592948.
- John F Nash Jr. Equilibrium points in n-person games. Proceedings of the National Academy of Sciences, 36(1):48–49, 1950.
- Fast swap regret minimization and applications to approximate correlated equilibria, November 2023. URL http://arxiv.org/abs/2310.19647. arXiv:2310.19647 [cs].
- Escaping limit cycles: Global convergence for constrained nonconvex-nonconcave minimax problems. In International Conference on Learning Representations (ICLR), 2022.
- Evolutionary dynamics and phi-regret minimization in games. Journal of Artificial Intelligence Research, 74:1125–1158, 2022.
- Online learning: Beyond regret. In Proceedings of the 24th Annual Conference on Learning Theory, pages 559–594. JMLR Workshop and Conference Proceedings, 2011.
- Optimization, learning, and games with predictable sequences. Advances in Neural Information Processing Systems, 2013.
- On the characterization of local Nash equilibria in continuous games. IEEE Transactions on Automatic Control, 61(8):2301–2307, 2016.
- Julia Robinson. An iterative method of solving a game. Annals of Mathematics, pages 296–301, 1951.
- J Ben Rosen. Existence and uniqueness of equilibrium points for concave n-person games. Econometrica, pages 520–534, 1965.
- Maurice Sion. On general minimax theorems. Pacific J. Math., 8(4):171–176, 1958.
- Sample-efficient learning of correlated equilibria in extensive-form games. Advances in Neural Information Processing Systems, 35:4099–4110, 2022.
- Learning correlated equilibria in games with compact sets of strategies. Games and Economic Behavior, 59(1):187–208, April 2007. ISSN 08998256. doi: 10.1016/j.geb.2006.04.007. URL https://linkinghub.elsevier.com/retrieve/pii/S0899825606000911.
- Online non-convex learning: Following the perturbed leader is optimal. In Algorithmic Learning Theory, pages 845–861. PMLR, 2020.
- Fast convergence of regularized learning in games. Advances in Neural Information Processing Systems (NeurIPS), 2015.
- J v. Neumann. Zur Theorie der Gesellschaftsspiele. Mathematische Annalen, 100(1):295–320, 1928.
- Extensive-form correlated equilibrium: Definition and computational complexity. Mathematics of Operations Research, 33(4):1002–1022, 2008.
- On solving minimax optimization locally: A follow-the-ridge approach. In International Conference on Learning Representations (ICLR), 2020.
- Martin Zinkevich. Online convex programming and generalized infinitesimal gradient ascent. In Proceedings of the 20th International Conference on Machine Learning (ICML), 2003.