Leveraging Team Correlation for Approximating Equilibrium in Two-Team Zero-Sum Games (2403.00255v1)
Abstract: Two-team zero-sum games are one of the most important paradigms in game theory. In this paper, we focus on finding an unexploitable equilibrium in large team games. An unexploitable equilibrium is a worst-case policy under which members of the opponent team cannot increase their team reward by adopting any policy, e.g., by cooperatively switching to a different joint policy. As an optimal unexploitable equilibrium in two-team zero-sum games, the correlated-team maxmin equilibrium remains unexploitable even in the worst case, where players in the opponent team can achieve arbitrary cooperation through a joint team policy. However, finding such an equilibrium in large games is challenging because evaluating the exponentially many joint policies is impractical. To solve this problem, we first introduce a general solution concept called the restricted correlated-team maxmin equilibrium, which replaces the intractable evaluation of all joint policies with the evaluation of a sampled subset, controlled by a sample factor, while avoiding the exploitation problem that incomplete joint-policy evaluation would otherwise introduce. We then develop an efficient sequential correlation mechanism, based on which we propose an algorithm for approximating the unexploitable equilibrium in large games. We show that our approach achieves lower exploitability than the state-of-the-art baseline when facing opponent teams with different exploitation abilities in large team games, including Google Research Football.
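To make the solution concept concrete: in a two-team zero-sum normal-form game, a correlated strategy for a team is a distribution $\mu$ over that team's joint actions, and the correlated-team maxmin value is $\max_{\mu} \min_{b} \mathbb{E}_{a \sim \mu}[U(a, b)]$, where $b$ ranges over the opponent team's joint actions (the inner minimum over correlated opponent strategies is attained at a pure joint action, since the objective is linear in the opponent's distribution). The sketch below is our own illustration, not the paper's algorithm: it computes this value exactly by linear programming on a randomly generated toy game, and the payoff tensor and all variable names are assumptions made for the example.

```python
# Minimal sketch (illustrative, not the paper's method): exact
# correlated-team maxmin value of a tiny two-team zero-sum game via LP.
# U[a1, a2, b1, b2] is team A's payoff; team B receives its negation.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n = 2  # actions per player; each team has two players here
U = rng.uniform(-1.0, 1.0, size=(n, n, n, n))

# Rows: team A joint actions; columns: team B joint actions.
M = U.reshape(n * n, n * n)
num_a, num_b = M.shape

# Variables: mu (distribution over team A's joint actions) and value v.
# Maximize v subject to mu^T M[:, b] >= v for every opposing joint action b.
c = np.zeros(num_a + 1)
c[-1] = -1.0                                   # linprog minimizes, so minimize -v
A_ub = np.hstack([-M.T, np.ones((num_b, 1))])  # v - mu^T M[:, b] <= 0
b_ub = np.zeros(num_b)
A_eq = np.hstack([np.ones((1, num_a)), np.zeros((1, 1))])  # mu sums to 1
b_eq = np.array([1.0])
bounds = [(0, None)] * num_a + [(None, None)]  # mu >= 0, v unbounded

res = linprog(c, A_ub=A_ub, b_ub=b_ub, A_eq=A_eq, b_eq=b_eq,
              bounds=bounds, method="highs")
mu, value = res.x[:-1], res.x[-1]
print("correlated-team maxmin value:", value)
```

Even in this toy setting, mu has one coordinate per joint action, so with $k$ players of $n$ actions each it has $n^k$ coordinates; this exponential blow-up is exactly what the restricted equilibrium and its sampled joint-policy evaluation are designed to avoid.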
- Algorithms and complexity for computing Nash equilibria in adversarial team games. arXiv preprint arXiv:2301.02129, 2023.
- Team-maxmin equilibrium: Efficiency bounds and algorithms. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 31, 2017.
- Superhuman AI for heads-up no-limit poker: Libratus beats top professionals. Science, 359(6374):418–424, 2018.
- Computational results for extensive-form adversarial team games. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32, 2018.
- Online double oracle. arXiv preprint arXiv:2103.07780, 2021.
- The rating of chessplayers: Past and present. Arco Publishing, 1978.
- Trust region policy optimisation in multi-agent reinforcement learning. In International Conference on Learning Representations, 2022.
- Google Research Football: A novel reinforcement learning environment. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pp. 4501–4510, 2020.
- Markov games as a framework for multi-agent reinforcement learning. In Machine Learning Proceedings 1994, pp. 157–163. Morgan Kaufmann, 1994.
- Team-PSRO for learning approximate TMECor in large team games via cooperative reinforcement learning. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
- Non-cooperative games. Annals of Mathematics, pp. 286–295, 1951.
- The StarCraft multi-agent challenge. arXiv preprint arXiv:1902.04043, 2019.
- Stackelberg security games: Looking beyond a decade of success. In IJCAI, 2018.
- Boosting studies of multi-agent reinforcement learning on Google Research Football environment: the past, present, and future. arXiv preprint arXiv:2309.12951, 2023.
- Trust region bounds for decentralized PPO under non-stationarity. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, pp. 5–13, 2023.
- StarCraft II: A new challenge for reinforcement learning. arXiv preprint arXiv:1708.04782, 2017.
- Team-maxmin equilibria. Games and Economic Behavior, 21(1-2):309–321, 1997.
- Multi-agent reinforcement learning is a sequence modeling problem. Advances in Neural Information Processing Systems, 35:16509–16521, 2022.
- DOTA: A large-scale dataset for object detection in aerial images. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 3974–3983, 2018.
- Fictitious cross-play: Learning global Nash equilibrium in mixed cooperative-competitive games. In Proceedings of the 2023 International Conference on Autonomous Agents and Multiagent Systems, pp. 1053–1061, 2023.
- The surprising effectiveness of PPO in cooperative multi-agent games. Advances in Neural Information Processing Systems, 35:24611–24624, 2022.
- Converging to team-maxmin equilibria in zero-sum multiplayer games. In International Conference on Machine Learning, pp. 11033–11043, 2020.
- Computing optimal Nash equilibria in multiplayer games. In Thirty-seventh Conference on Neural Information Processing Systems, 2023.
- MAgent: A many-agent reinforcement learning platform for artificial collective intelligence. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32, 2018.