Beyond Theorems: A Counterexample to Potential Markov Game Criteria (2405.08206v1)

Published 13 May 2024 in cs.GT and cs.MA

Abstract: There are only limited classes of multi-player stochastic games in which independent learning is guaranteed to converge to a Nash equilibrium. Markov potential games are a key example of such classes. Prior work has outlined sets of sufficient conditions for a stochastic game to qualify as a Markov potential game. However, these conditions often impose strict limitations on the game's structure and tend to be challenging to verify. To address these limitations, Mguni et al. [12] introduce a relaxed notion of Markov potential games and offer an alternative set of necessary conditions for categorizing stochastic games as potential games. Under these conditions, the authors claim that a deterministic Nash equilibrium can be computed efficiently by solving a dual Markov decision process. In this paper, we offer evidence refuting this claim by presenting a counterexample.
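
For context, the standard (unrelaxed) definition of a Markov potential game requires a single potential function Φ that tracks every player's unilateral policy deviations. In the usual notation of the Markov potential game literature (a standard formulation, not quoted from this paper), V_i denotes player i's value function, π_i player i's policy, and π_{-i} the joint policy of the remaining players:

\[
V_i^{(\pi_i,\,\pi_{-i})}(s) \;-\; V_i^{(\pi_i',\,\pi_{-i})}(s) \;=\; \Phi^{(\pi_i,\,\pi_{-i})}(s) \;-\; \Phi^{(\pi_i',\,\pi_{-i})}(s)
\qquad \text{for all players } i,\ \text{states } s,\ \text{and policies } \pi_i,\ \pi_i',\ \pi_{-i}.
\]

This is the stochastic-game analogue of the Monderer and Shapley potential condition for normal-form games [13]; the relaxed notion introduced by Mguni et al. [12], whose accompanying computational claim this paper refutes, weakens this exact-potential requirement.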

References (17)
  1. Lawrence E Blume. 1995. The statistical mechanics of best-response strategy revision. Games and Economic Behavior 11, 2 (1995), 111–145.
  2. Vivek S Borkar. 2002. Reinforcement learning in Markovian evolutionary games. Advances in Complex Systems 5, 01 (2002), 55–72.
  3. The complexity of Markov equilibrium in stochastic games. In The 36th Annual Conference on Learning Theory. 4180–4234.
  4. Arlington M Fink. 1964. Equilibrium in a stochastic n-person game. Journal of Science of the Hiroshima University, Series A-I (Mathematics) 28, 1 (1964), 89–93.
  5. Learning with Opponent-Learning Awareness. (2018), 122–130.
  6. Stabilising experience replay for deep multi-agent reinforcement learning. In Proceedings of the 34th International Conference on Machine Learning (ICML). 1146–1155.
  7. Independent natural policy gradient always converges in Markov potential games. In Proceedings of the International Conference on Artificial Intelligence and Statistics. 4414–4425.
  8. Decentralized single-timescale actor-critic on zero-sum two-player stochastic games. In Proceedings of the International Conference on Machine Learning (ICML). 3899–3909.
  9. Junling Hu and Michael P Wellman. 2003. Nash Q-learning for general-sum stochastic games. Journal of Machine Learning Research 4 (2003), 1039–1069.
  10. Global convergence of multi-agent policy gradient in Markov potential games. arXiv preprint arXiv:2106.01969 (2021).
  11. Learning parametric closed-loop policies for Markov potential games. arXiv preprint arXiv:1802.00899 (2018).
  12. David Mguni et al. 2021. Learning in nonzero-sum stochastic games with potentials. In Proceedings of the International Conference on Machine Learning (ICML). 7688–7699.
  13. Dov Monderer and Lloyd S Shapley. 1996. Potential games. Games and Economic Behavior 14, 1 (1996), 124–143.
  14. Learning Nash equilibrium for general-sum Markov games from batch data. In Proceedings of the 20th International Conference on Artificial Intelligence and Statistics. 232–241.
  15. Richard S Sutton and Andrew G Barto. 2018. Reinforcement learning: An introduction. MIT Press.
  16. Christopher JCH Watkins and Peter Dayan. 1992. Q-learning. Machine Learning 8 (1992), 279–292.
  17. Christopher John Cornish Hellaby Watkins. 1989. Learning from delayed rewards. (1989).
