TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning

Published 25 Nov 2020 in cs.LG, cs.AI, and cs.MA | (2011.12895v2)

Abstract: Competitive Self-Play (CSP) based Multi-Agent Reinforcement Learning (MARL) has shown phenomenal breakthroughs recently. Strong AIs are achieved for several benchmarks, including Dota 2, Glory of Kings, Quake III, StarCraft II, to name a few. Despite the success, the MARL training is extremely data thirsty, requiring typically billions of (if not trillions of) frames be seen from the environment during training in order for learning a high performance agent. This poses non-trivial difficulties for researchers or engineers and prevents the application of MARL to a broader range of real-world problems. To address this issue, in this manuscript we describe a framework, referred to as TLeague, that aims at large-scale training and implements several main-stream CSP-MARL algorithms. The training can be deployed in either a single machine or a cluster of hybrid machines (CPUs and GPUs), where the standard Kubernetes is supported in a cloud native manner. TLeague achieves a high throughput and a reasonable scale-up when performing distributed training. Thanks to the modular design, it is also easy to extend for solving other multi-agent problems or implementing and verifying MARL algorithms. We present experiments over StarCraft II, ViZDoom and Pommerman to show the efficiency and effectiveness of TLeague. The code is open-sourced and available at https://github.com/tencent-ailab/tleague_projpage

Summary

The paper presents TLeague, a scalable framework that enables high-throughput distributed multi-agent reinforcement learning with competitive self-play.
Its modular design and integration with cloud-native tools like Kubernetes and Horovod allow flexible and efficient deployment across diverse simulation environments.
Experimental results in StarCraft II, ViZDoom, and Pommerman validate TLeague’s competitive performance and extensibility for advancing MARL research.

The paper "TLeague: A Framework for Competitive Self-Play based Distributed Multi-Agent Reinforcement Learning" presents an infrastructure designed to efficiently handle the demanding computational requirements of Competitive Self-Play (CSP) in Multi-Agent Reinforcement Learning (MARL). Developed by researchers at Tencent Robotics X and Tsinghua University, TLeague addresses the significant data demands of MARL by enabling large-scale, distributed training.

Key Contributions

The paper outlines several critical contributions to the field of MARL:

Scalability and Modular Design: TLeague runs on either a single machine or a cluster, with Kubernetes supported for cloud-native deployment. This flexibility matters for scaling up training, which can require extensive computational resources, including tens of thousands of CPU cores and hundreds of GPUs.
High Throughput Distributed Training: With its ability to maximize the utilization of hybrid machines (comprising CPUs and GPUs), TLeague achieves high throughput, making large-scale MARL experiments feasible. The use of Horovod for synchronous gradient updates further optimizes resource utilization.
Extensibility and Versatility: TLeague's modular design facilitates easy extension for new multi-agent problems and supports mainstream CSP-MARL algorithms. This adaptability makes it a robust choice for researchers looking to deploy MARL in diverse environments and applications.
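
To make the "synchronous gradient updates" idea concrete, here is a minimal sketch of what a synchronous data-parallel step computes: every worker's gradient is averaged (the allreduce step), and all workers then apply the same update. It deliberately does not call Horovod; the function names `allreduce_mean` and `synchronous_step` are hypothetical and chosen only for illustration, not taken from TLeague or Horovod.

```python
from typing import List


def allreduce_mean(per_worker_grads: List[List[float]]) -> List[float]:
    """Average gradients element-wise across workers.

    Stand-in for the allreduce that a library such as Horovod performs
    across processes; here all "workers" live in one process.
    """
    num_workers = len(per_worker_grads)
    # zip(*...) groups the i-th gradient component from every worker.
    return [sum(components) / num_workers for components in zip(*per_worker_grads)]


def synchronous_step(params: List[float],
                     per_worker_grads: List[List[float]],
                     lr: float) -> List[float]:
    """Apply one SGD step using the averaged gradient.

    Because every worker sees the same averaged gradient, all replicas
    hold identical parameters after the step.
    """
    avg_grad = allreduce_mean(per_worker_grads)
    return [p - lr * g for p, g in zip(params, avg_grad)]


if __name__ == "__main__":
    params = [1.0, 2.0]
    grads = [[1.0, 0.0], [3.0, 2.0]]  # one gradient vector per worker
    print(synchronous_step(params, grads, lr=0.5))
```

The trade-off of the synchronous scheme is that each step waits for the slowest worker, in exchange for every replica staying exactly in step.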

Numerical Results and Experiments

The paper discusses experiments carried out using TLeague on environments like StarCraft II, ViZDoom, and Pommerman, showcasing its efficiency and effectiveness. For instance:

StarCraft II: TLeague was demonstrated on the Zerg-vs-Zerg full game, highlighting the framework's ability to handle complex, strategic video games.
ViZDoom: The framework successfully trained agents that outperformed both built-in bots and existing champions in the ViZDoom Competition track, establishing TLeague's competitive advantage.
Pommerman: In the NeurIPS 2018 competition environment for 2vs2, TLeague-trained agents achieved superior performance metrics, further proving the framework's capability in handling cooperative-competitive environments.

Theoretical Implications

Fictitious Self-Play (FSP), as implemented in TLeague, is rooted in game-theoretic methods for finding a Nash Equilibrium, which gives the training process a theoretically sound basis. By sampling opponents from past policies rather than always playing the latest one, this approach addresses commonly encountered issues in MARL, such as non-stationary dynamics and policy forgetting.
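
The opponent-sampling idea can be sketched as follows. This is a minimal illustration under stated assumptions, not TLeague's implementation: it assumes the common reading of FSP in which the learner's opponent is drawn uniformly from a pool of frozen historical policies, and the names `PolicyPool`, `sample_opponent`, and `run_self_play` are hypothetical. Policies are represented only by string labels; no actual training happens.

```python
import random
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class PolicyPool:
    """Hypothetical pool of frozen historical policies (checkpoint labels)."""
    checkpoints: List[str] = field(default_factory=list)

    def add(self, name: str) -> None:
        self.checkpoints.append(name)

    def sample_opponent(self, rng: random.Random) -> str:
        # FSP-style sampling (assumed here): uniform over all past policies,
        # so older policies keep appearing as opponents.
        return rng.choice(self.checkpoints)


def run_self_play(num_periods: int, matches_per_period: int,
                  seed: int = 0) -> Tuple[PolicyPool, List[Tuple[str, str]]]:
    """Return the final pool and the (learner, opponent) match schedule."""
    rng = random.Random(seed)
    pool = PolicyPool()
    pool.add("policy_0")  # initial policy seeds the pool
    schedule = []
    for period in range(1, num_periods + 1):
        learner = f"policy_{period}"
        for _ in range(matches_per_period):
            schedule.append((learner, pool.sample_opponent(rng)))
        pool.add(learner)  # freeze the learner and add it to the pool
    return pool, schedule


if __name__ == "__main__":
    pool, schedule = run_self_play(num_periods=3, matches_per_period=4)
    for learner, opponent in schedule:
        print(f"{learner} vs {opponent}")
```

Because the learner keeps meeting older policies, it is penalized for forgetting how to beat them, and the opponent distribution changes only gradually as the pool grows, which softens non-stationarity.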

Practical Implications and Future Directions

From a practical perspective, TLeague's architecture is well suited to industries and applications where MARL solutions are applicable but were previously hindered by computational constraints. Future work could explore applying the framework at even larger scales and to more complex problems, such as strategic military simulations or real-world robotics challenges.

In conclusion, TLeague represents a robust and scalable solution for CSP-MARL, addressing both theoretical and practical challenges in today's AI research landscape. Its open-source nature and design flexibility make it a valuable tool for advancing the frontiers of multi-agent interactions and learning algorithms.
