
XuanCe: A Comprehensive and Unified Deep Reinforcement Learning Library (2312.16248v1)

Published 25 Dec 2023 in cs.LG, cs.AI, and cs.DL

Abstract: In this paper, we present XuanCe, a comprehensive and unified deep reinforcement learning (DRL) library designed to be compatible with PyTorch, TensorFlow, and MindSpore. XuanCe offers a wide range of functionalities, including over 40 classical DRL and multi-agent DRL algorithms, with the flexibility to easily incorporate new algorithms and environments. It is a versatile DRL library that supports CPU, GPU, and Ascend, and can be executed on various operating systems such as Ubuntu, Windows, MacOS, and EulerOS. Extensive benchmarks conducted on popular environments including MuJoCo, Atari, and StarCraftII multi-agent challenge demonstrate the library's impressive performance. XuanCe is open-source and can be accessed at https://github.com/agi-brain/xuance.git.

References (32)
  1. TensorFlow: A system for large-scale machine learning. In OSDI, volume 16, pages 265–283, Savannah, GA, USA, 2016.
  2. Joshua Achiam. Spinning Up in Deep Reinforcement Learning. 2018.
  3. OpenAI Gym. arXiv preprint arXiv:1606.01540, 2016.
  4. Dopamine: A research framework for deep reinforcement learning. 2018. URL http://arxiv.org/abs/1812.06110.
  5. MushroomRL: Simplifying reinforcement learning research. Journal of Machine Learning Research, 22(131):1–5, 2021. URL http://jmlr.org/papers/v22/18-056.html.
  6. RLzoo: A comprehensive and adaptive reinforcement learning library. arXiv preprint arXiv:2009.08644, 2020.
  7. Addressing function approximation error in actor-critic methods. In International Conference on Machine Learning, pages 1587–1596. PMLR, 2018.
  8. ChainerRL: A deep reinforcement learning library. The Journal of Machine Learning Research, 22(1):3557–3570, 2021.
  9. Soft actor-critic: Off-policy maximum entropy deep reinforcement learning with a stochastic actor. In International Conference on Machine Learning, pages 1861–1870. PMLR, 2018.
  10. MARLlib: A scalable and efficient library for multi-agent reinforcement learning. Journal of Machine Learning Research, 24:1–23, 2023.
  11. Huawei Technologies Co., Ltd. Huawei MindSpore AI development framework. In Artificial Intelligence Technology, pages 137–162. Springer, 2022.
  12. OR-Gym: A reinforcement learning library for operations research problems. arXiv preprint arXiv:2008.06319, 2020.
  13. Google Research Football: A novel reinforcement learning environment. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 34, pages 4501–4510, 2020.
  14. Deep learning. Nature, 521(7553):436–444, 2015.
  15. RLlib: Abstractions for distributed reinforcement learning. In International Conference on Machine Learning, pages 3053–3062. PMLR, 2018.
  16. FinRL: Deep reinforcement learning framework to automate trading in quantitative finance. In Proceedings of the Second ACM International Conference on AI in Finance, pages 1–9, 2021.
  17. Human-level control through deep reinforcement learning. Nature, 518(7540):529–533, 2015.
  18. Fabio Pardo. Tonic: A deep reinforcement learning library for fast prototyping and benchmarking. arXiv preprint arXiv:2011.07537, 2020.
  19. PyTorch: An imperative style, high-performance deep learning library. Advances in Neural Information Processing Systems, 32, 2019.
  20. Language models are unsupervised multitask learners. OpenAI Blog, 1(8):9, 2019.
  21. Weighted QMIX: Expanding monotonic value function factorisation for deep multi-agent reinforcement learning. Advances in Neural Information Processing Systems, 33:10199–10210, 2020.
  22. Monotonic value function factorisation for deep multi-agent reinforcement learning. The Journal of Machine Learning Research, 21(1):7234–7284, 2020.
  23. Proximal policy optimization algorithms. arXiv preprint arXiv:1707.06347, 2017.
  24. d3rlpy: An offline deep reinforcement learning library. The Journal of Machine Learning Research, 23(1):14205–14224, 2022.
  25. skrl: Modular and flexible library for reinforcement learning. Journal of Machine Learning Research, 24(254):1–9, 2023.
  26. A general reinforcement learning algorithm that masters chess, shogi, and Go through self-play. Science, 362(6419):1140–1144, 2018.
  27. Value-decomposition multi-agent actor-critics. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pages 11352–11360, 2021.
  28. Value-decomposition networks for cooperative multi-agent learning. arXiv preprint arXiv:1706.05296, 2017.
  29. StarCraft II: A new challenge for reinforcement learning. arXiv preprint arXiv:1708.04782, 2017.
  30. Tianshou: A highly modularized deep reinforcement learning library. Journal of Machine Learning Research, 23(267):1–6, 2022.
  31. The surprising effectiveness of PPO in cooperative multi-agent games. Advances in Neural Information Processing Systems, 35:24611–24624, 2022.
  32. MAgent: A many-agent reinforcement learning platform for artificial collective intelligence. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 32, 2018.

Summary

  • The paper presents XuanCe, a unified library that streamlines both DRL and MARL research with support for over 40 algorithms.
  • The paper demonstrates versatile integration with PyTorch, TensorFlow, and MindSpore across diverse hardware and operating systems.
  • The paper validates XuanCe’s efficacy with benchmarks on environments like MuJoCo, Atari, and the StarCraft II multi-agent challenge, showcasing its competitive performance.

Overview of XuanCe: A Deep Reinforcement Learning Library

The paper presents XuanCe, a sophisticated deep reinforcement learning (DRL) library that offers integration with major deep learning frameworks such as PyTorch, TensorFlow, and MindSpore. The primary aim of XuanCe is to address the heterogeneous nature of DRL algorithms and environments, providing a unified platform that simplifies the development and evaluation of both traditional single-agent DRL and multi-agent reinforcement learning (MARL) techniques. The library is open-source and supports a range of hardware and operating system platforms, thus providing flexibility and adaptability for users in diverse computing environments.

Key Features

XuanCe includes more than 40 algorithms spanning DRL and MARL, with compatibility across multiple deep learning frameworks. This algorithmic breadth is a key strength, giving researchers a comprehensive toolkit for exploring a wide variety of DRL applications. The library's architecture is modular, which simplifies the integration and testing of new algorithms and environments.

  1. Versatility: Compatible with PyTorch, TensorFlow, and MindSpore, XuanCe facilitates the deployment of DRL models on CPUs, GPUs, and Ascend hardware across operating systems such as Ubuntu, Windows, MacOS, and EulerOS.
  2. Algorithmic Diversity: Supporting over 40 algorithms, XuanCe offers a toolbox spanning value-based, policy-based, and MARL methods, covering applications from simple control tasks to complex multi-agent scenarios (a minimal usage sketch follows this list).
  3. Comprehensive Benchmarks: The library has been benchmarked on common environments such as MuJoCo, Atari, and the StarCraft II multi-agent challenge, and the reported results are competitive with those published in other DRL research.
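
To make the intended workflow concrete, here is a minimal sketch of launching a training run. It assumes the `get_runner` entry point advertised in the project repository; the exact argument names and available `method`/`env` identifiers are assumptions and should be checked against https://github.com/agi-brain/xuance.

```python
# Minimal sketch of launching a DRL run with XuanCe (argument names assumed
# from the project README; verify against https://github.com/agi-brain/xuance).
import xuance

# Build a runner for a chosen algorithm ("method"), environment family, and task.
runner = xuance.get_runner(
    method="dqn",           # one of the 40+ supported algorithms
    env="classic_control",  # environment family
    env_id="CartPole-v1",   # concrete task within that family
    is_test=False,          # train rather than evaluate
)

runner.run()  # start training with hyper-parameters taken from the YAML configs
```

Switching to another algorithm, environment, or backend framework is intended to be a matter of changing these identifiers and the corresponding YAML configuration rather than rewriting training code.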

Design and Implementation

XuanCe's design comprises four primary components:

  • Configs: YAML files that hold hyper-parameters and flexible environment and model configurations (an illustrative configuration sketch follows this list).
  • Common Tools: Utilities for preparing and initializing models before training, alongside memory utilities for experience replay, which off-policy learning strategies rely on.
  • Environments: Beyond supporting popular tasks, the library improves sample efficiency by running environments in parallel.
  • Algorithms: XuanCe organizes its DRL capabilities into a set of unified modules: utils, representations, policies, learners, agents, and runners.
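
To illustrate how such a YAML configuration might look and how its hyper-parameters could be consumed, the following is a hedged sketch; the field names are hypothetical stand-ins chosen for illustration, not XuanCe's actual configuration schema.

```python
# Illustrative sketch of a YAML hyper-parameter file and how a library could
# consume it; the field names are hypothetical, not XuanCe's actual schema.
import yaml

EXAMPLE_CONFIG = """
agent: DQN                 # algorithm to train
env_name: classic_control  # environment family
env_id: CartPole-v1        # concrete task
learning_rate: 0.001
gamma: 0.99                # discount factor
buffer_size: 10000         # replay-memory capacity used by off-policy learners
batch_size: 64
parallels: 8               # number of parallel environments for sample efficiency
"""

config = yaml.safe_load(EXAMPLE_CONFIG)

# Downstream components (representation -> policy -> learner -> agent -> runner)
# would each read the hyper-parameters they need from this single dictionary.
print(config["agent"], config["learning_rate"], config["parallels"])
```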

Related Work

The paper positions XuanCe relative to existing DRL libraries, highlighting its combination of broad algorithm support, modular structure, and multi-framework compatibility as distinguishing features. Libraries such as RLlib often focus on a subset of DRL strategies or on a single framework, whereas XuanCe aims to provide a unified and holistic approach.

Conclusion and Implications

XuanCe is a robust, feature-rich DRL library well suited to advanced DRL research and application development. Its portability across frameworks, hardware, and environments makes it a valuable tool for researchers who want to experiment rapidly across a spectrum of scenarios, and its extensive benchmarking gives users a basis for judging its performance against existing solutions. Future work may extend the algorithm base or further optimize its adaptability to diverse operational settings, broadening the range of problems the community can explore with it.
