MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence (1712.00600v1)

Published 2 Dec 2017 in cs.LG, cs.AI, and cs.MA

Abstract: We introduce MAgent, a platform to support research and development of many-agent reinforcement learning. Unlike previous research platforms on single or multi-agent reinforcement learning, MAgent focuses on supporting the tasks and the applications that require hundreds to millions of agents. Within the interactions among a population of agents, it enables not only the study of learning algorithms for agents' optimal policies, but more importantly, the observation and understanding of individual agents' behaviors and the social phenomena emerging from the AI society, including communication languages, leadership, and altruism. MAgent is highly scalable and can host up to one million agents on a single GPU server. MAgent also provides flexible configurations for AI researchers to design their customized environments and agents. In this demo, we present three environments designed on MAgent and show emerged collective intelligence by learning from scratch.

MAgent: A Comprehensive Platform for Large-Scale Multi-Agent Reinforcement Learning

The paper "MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence" introduces MAgent, a platform specifically designed to address the challenges associated with many-agent reinforcement learning (RL), particularly in the context of Artificial Collective Intelligence (ACI). This platform supports research endeavors involving hundreds to millions of agents, a scale that is inadequately addressed by existing platforms like ALE, OpenAI Gym, and others traditionally limited to dozens of agents.

Key Features of MAgent

MAgent's scalability is a central feature, enabling simulations of up to one million agents on a single GPU server. This scalability is achieved through innovative techniques such as network sharing and ID embedding. Additionally, MAgent is equipped with a flexible configuration system that allows researchers to tailor environments and agent behaviors to their specific needs. This includes a novel reward description language that facilitates the creation of complex interaction rules among agents.

A noteworthy aspect of MAgent is its gridworld environment, which serves as the foundational framework for agents to operate within. This environment accommodates heterogeneous agents and supports customizable state and action spaces, enabling rapid prototyping and development of various scenarios.
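
As a rough illustration, the sketch below shows how such a scenario might be declared through MAgent's Python configuration interface. It is modeled on the pursuit example distributed with the platform; the specific attribute values (sizes, speeds, view ranges) are illustrative, and exact function names may differ between releases.

```python
# A minimal sketch of MAgent's config-style reward description language.
# Two heterogeneous agent types are registered and a reward rule is
# attached to an interaction event between them.
import magent

gw = magent.gridworld
cfg = gw.Config()
cfg.set({"map_width": 64, "map_height": 64})  # illustrative map size

predator = cfg.register_agent_type(
    "predator",
    {"width": 2, "length": 2, "hp": 1, "speed": 1,
     "view_range": gw.CircleRange(5), "attack_range": gw.CircleRange(2)})
prey = cfg.register_agent_type(
    "prey",
    {"width": 1, "length": 1, "hp": 1, "speed": 1.5,
     "view_range": gw.CircleRange(4), "attack_range": gw.CircleRange(0)})

predator_group = cfg.add_group(predator)
prey_group = cfg.add_group(prey)

# Reward rule: whenever any predator attacks any prey, the predator
# receives +1 and the prey receives -1.
a = gw.AgentSymbol(predator_group, index="any")
b = gw.AgentSymbol(prey_group, index="any")
cfg.add_reward_rule(gw.Event(a, "attack", b), receiver=[a, b], value=[1, -1])
```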

Demonstration Environments

The platform's capabilities are illustrated through three example environments: Pursuit, Gathering, and Battle. In the Pursuit scenario, the emergence of local cooperation is observed as predator agents learn to collaboratively capture prey. The Gathering environment explores competitive dynamics over limited resources, with agents balancing between resource acquisition and strategic elimination of competitors. The Battle scenario showcases complex interactions involving both cooperation and competition, where two agent armies employ sophisticated strategies such as encirclement and guerrilla tactics.
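
In broad strokes, an episode in one of these environments can be driven from Python as sketched below. The call pattern follows the platform's example scripts, but the function names and signatures are recalled from the released code and may vary between versions; the random policy is only a stand-in for a learned one.

```python
# A rough sketch of running a MAgent gridworld episode with two agent groups.
import numpy as np
import magent

env = magent.GridWorld(cfg)          # cfg built as in the sketch above
predators, prey = env.get_handles()  # one handle per agent group

env.reset()
env.add_agents(predators, method="random", n=200)
env.add_agents(prey, method="random", n=400)

done = False
while not done:
    for handle in (predators, prey):
        obs = env.get_observation(handle)   # spatial view + feature vector
        n = len(env.get_agent_id(handle))
        # Random policy as a placeholder for a trained model.
        acts = np.random.randint(env.get_action_space(handle)[0],
                                 size=n).astype(np.int32)
        env.set_action(handle, acts)
    done = env.step()                       # advance the whole population
    rewards = env.get_reward(predators)
    env.clear_dead()                        # remove eliminated agents
```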

Baseline Algorithms and Interactive Features

MAgent includes implementations of parameter-sharing DQN, DRQN, and A2C, with DQN noted for its superior performance in the tested settings. These baseline algorithms provide a solid foundation for benchmarking new multi-agent algorithms.
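
The parameter-sharing idea behind these baselines is that every agent in a group queries one shared network, with an embedding of its agent ID appended to the observation so that identically parameterized agents can still behave differently. The sketch below illustrates this scheme in PyTorch under those assumptions; the layer sizes, embedding width, and action count are illustrative choices rather than values from the paper.

```python
# A condensed sketch of a parameter-sharing Q-network with ID embedding.
import torch
import torch.nn as nn

class SharedQNet(nn.Module):
    def __init__(self, obs_dim, n_actions, n_agents, id_dim=16):
        super().__init__()
        self.id_embed = nn.Embedding(n_agents, id_dim)   # agent ID embedding
        self.net = nn.Sequential(
            nn.Linear(obs_dim + id_dim, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, n_actions))

    def forward(self, obs, agent_ids):
        # obs: (batch, obs_dim); agent_ids: (batch,) integer IDs
        x = torch.cat([obs, self.id_embed(agent_ids)], dim=-1)
        return self.net(x)                               # (batch, n_actions)

# One batched forward pass serves the whole group: stack the observations
# of all agents and pick greedy actions from the shared network.
qnet = SharedQNet(obs_dim=512, n_actions=21, n_agents=1000)
obs = torch.randn(1000, 512)
ids = torch.arange(1000)
actions = qnet(obs, ids).argmax(dim=-1)                  # one action per agent
```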

The platform also offers a visually effective rendering system, enabling users to interactively observe and manipulate the environment. This includes the ability for human players to directly engage with AI agents, providing valuable insights into agent strategies and behaviors.

Implications and Future Directions

MAgent stands out as a crucial tool for advancing research in many-agent reinforcement learning and ACI. By facilitating the study of large populations of agents, MAgent provides valuable opportunities for exploring emergent phenomena such as communication systems, leadership structures, and altruism within artificial societies.

Future developments for MAgent include the incorporation of continuous environments and the expansion of available algorithms, which will further enhance the platform's applicability and depth. As research in ACI progresses, MAgent is poised to play a pivotal role in unlocking new insights and fostering advancements in the field of multi-agent systems.

Authors (6)
  1. Lianmin Zheng
  2. Jiacheng Yang
  3. Han Cai
  4. Weinan Zhang
  5. Jun Wang
  6. Yong Yu
Citations (197)