Mava: a research library for distributed multi-agent reinforcement learning in JAX
Abstract: Multi-agent reinforcement learning (MARL) research is inherently computationally expensive, and it is often difficult to obtain a sufficient number of experiment samples to test hypotheses and make robust statistical claims. Furthermore, MARL algorithms are typically complex in their design and can be tricky to implement correctly. These aspects of MARL make it difficult to create useful software for advanced research. Our criteria for such software are that it should be simple enough to implement new ideas quickly, while also being scalable and fast enough to test those ideas in a reasonable amount of time. In this preliminary technical report, we introduce Mava, a research library for MARL written purely in JAX, that aims to fulfill these criteria. We discuss the design and core features of Mava, and demonstrate its use and performance across a variety of environments. In particular, we show Mava's substantial speed advantage, with improvements of 10-100x compared to other popular MARL frameworks, while maintaining strong performance. This allows researchers to test ideas in a few minutes instead of several hours. Finally, Mava forms part of an ecosystem of libraries that seamlessly integrate with each other to help facilitate advanced research in MARL. We hope Mava will benefit the community and help drive scientifically sound and statistically robust research in the field. The open-source repository for Mava is available at https://github.com/instadeepai/Mava.