On the Stability of Learning in Network Games with Many Players (2403.15848v1)
Abstract: Multi-agent learning algorithms have been shown to display complex, unstable behaviours in a wide array of games. In fact, previous works indicate that convergent behaviours are less likely to occur as the total number of agents increases. This seemingly prohibits convergence to stable strategies, such as Nash Equilibria, in games with many players. To make progress towards addressing this challenge, we study the Q-Learning Dynamics, a classical model for exploration and exploitation in multi-agent learning. In particular, we study the behaviour of Q-Learning in games where interactions between agents are constrained by a network. We determine a number of sufficient conditions, depending on the game and network structure, which guarantee that agent strategies converge to a unique stable strategy, called the Quantal Response Equilibrium (QRE). Crucially, these sufficient conditions are independent of the total number of agents, allowing for provable convergence in arbitrarily large games. Next, we compare the learned QRE to the underlying Nash Equilibrium (NE) of the game by showing that any QRE is an $\epsilon$-approximate Nash Equilibrium. We provide tight bounds on $\epsilon$ and show how these bounds lead naturally to a centralised scheme for choosing exploration rates, which enables independent learners to learn stable approximate Nash Equilibrium strategies. We validate the method through experiments and demonstrate its effectiveness even in the presence of numerous agents and actions. Through these results, we show that independent learning dynamics may converge to approximate Nash Equilibria, even in the presence of many agents.
Authors: Aamal Hussain, Dan Leonte, Francesco Belardinelli, Georgios Piliouras