Paths to Equilibrium in Games (2403.18079v2)

Published 26 Mar 2024 in cs.GT, cs.AI, and cs.LG

Abstract: In multi-agent reinforcement learning (MARL) and game theory, agents repeatedly interact and revise their strategies as new data arrives, producing a sequence of strategy profiles. This paper studies sequences of strategies satisfying a pairwise constraint inspired by policy updating in reinforcement learning, where an agent who is best responding in one period does not switch its strategy in the next period. This constraint merely requires that optimizing agents do not switch strategies, but does not constrain the non-optimizing agents in any way, and thus allows for exploration. Sequences with this property are called satisficing paths, and they arise naturally in many MARL algorithms. A fundamental question about strategic dynamics is the following: for a given game and initial strategy profile, is it always possible to construct a satisficing path that terminates at an equilibrium? The resolution of this question has implications for the capabilities and limitations of a class of MARL algorithms. We answer this question in the affirmative for normal-form games. Our analysis reveals a counterintuitive insight: reward-deteriorating strategic updates are key to driving play to equilibrium along a satisficing path.
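To make the satisficing-path condition concrete, here is a minimal sketch (not taken from the paper; names such as `is_satisficing_path` and the coordination-game payoffs are illustrative, and pure strategies stand in for the mixed strategies treated in the paper). It checks whether a finite sequence of profiles in a two-player normal-form game satisfies the pairwise constraint: any player that is best responding at one step keeps its strategy at the next step.

```python
from typing import List, Tuple

# Payoff tables for a 2x2 coordination game:
# payoffs[i][a0][a1] is player i's reward when player 0 plays a0 and player 1 plays a1.
payoffs = [
    [[1, 0], [0, 1]],  # player 0
    [[1, 0], [0, 1]],  # player 1
]

NUM_ACTIONS = (len(payoffs[0]), len(payoffs[0][0]))  # action counts for players 0 and 1

Profile = Tuple[int, int]  # (action of player 0, action of player 1)


def is_best_responding(player: int, profile: Profile) -> bool:
    """True if `player` cannot improve its payoff by a unilateral deviation."""
    current = payoffs[player][profile[0]][profile[1]]
    for a in range(NUM_ACTIONS[player]):
        deviation = list(profile)
        deviation[player] = a
        if payoffs[player][deviation[0]][deviation[1]] > current:
            return False
    return True


def is_satisficing_path(path: List[Profile]) -> bool:
    """Pairwise constraint: a player that is best responding at step t must keep
    the same action at step t+1; non-best-responding players may switch freely."""
    for t in range(len(path) - 1):
        for player in range(2):
            if is_best_responding(player, path[t]) and path[t + 1][player] != path[t][player]:
                return False
    return True


# From the miscoordinated profile (0, 1) neither player is best responding, so
# both may switch; (1, 1) is a Nash equilibrium and the path stays there.
print(is_satisficing_path([(0, 1), (1, 1), (1, 1)]))  # True
# Switching away from the equilibrium (1, 1) violates the constraint.
print(is_satisficing_path([(1, 1), (0, 1)]))          # False
```

Note that a satisficing path may pass through profiles where some players' rewards temporarily deteriorate; the paper's central observation is that such reward-deteriorating updates are precisely what allow satisficing paths to reach equilibrium in general normal-form games.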

