QuadSwarm: A Modular Multi-Quadrotor Simulator for Deep Reinforcement Learning with Direct Thrust Control (2306.09537v1)

Published 15 Jun 2023 in cs.RO, cs.AI, cs.LG, cs.MA, cs.SY, and eess.SY

Abstract: Reinforcement learning (RL) has shown promise in creating robust policies for robotics tasks. However, contemporary RL algorithms are data-hungry, often requiring billions of environment transitions to train successful policies. This necessitates the use of fast and highly parallelizable simulators. In addition to speed, such simulators need to model the physics of the robots and their interaction with the environment to a level acceptable for transferring policies learned in simulation to reality. We present QuadSwarm, a fast, reliable simulator for research in single- and multi-robot RL for quadrotors that addresses both issues. QuadSwarm, with fast forward-dynamics propagation decoupled from rendering, is designed to be highly parallelizable such that throughput scales linearly with additional compute. It provides multiple components tailored toward multi-robot RL, including diverse training scenarios and domain randomization, to facilitate the development and sim2real transfer of multi-quadrotor control policies. Initial experiments suggest that QuadSwarm achieves over 48,500 simulation samples per second (SPS) on a single quadrotor and over 62,000 SPS on eight quadrotors on a 16-core CPU. The code can be found at https://github.com/Zhehui-Huang/quad-swarm-rl.
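The throughput claim (SPS scaling roughly linearly with additional CPU cores because forward dynamics is decoupled from rendering) can be illustrated with a small benchmarking sketch. The snippet below is a minimal, hypothetical example and not QuadSwarm code: `worker_rollout`, the placeholder dynamics, and the state and thrust dimensions are assumptions chosen only to show how samples per second might be measured across parallel, rendering-free environment workers.

```python
# Hypothetical sketch (not the QuadSwarm API): measure simulation samples
# per second (SPS) when identical, rendering-free environment workers run
# in parallel on separate CPU cores.
import time
import numpy as np
from multiprocessing import Pool

STATE_DIM = 18            # placeholder per-quadrotor state size (assumption)
STEPS_PER_WORKER = 50_000  # transitions simulated by each worker

def worker_rollout(seed: int) -> int:
    """Run a cheap forward-dynamics loop and return the number of steps taken."""
    rng = np.random.default_rng(seed)
    state = rng.standard_normal(STATE_DIM)
    dt = 0.005
    for _ in range(STEPS_PER_WORKER):
        thrust = rng.uniform(0.0, 1.0, size=4)       # stand-in for direct thrust commands
        state = state + dt * (thrust.sum() - state)  # stand-in for real quadrotor dynamics
    return STEPS_PER_WORKER

if __name__ == "__main__":
    for num_workers in (1, 4, 8, 16):
        start = time.perf_counter()
        with Pool(num_workers) as pool:
            total_steps = sum(pool.map(worker_rollout, range(num_workers)))
        elapsed = time.perf_counter() - start
        print(f"{num_workers:2d} workers: {total_steps / elapsed:,.0f} SPS")
```

Because each worker is independent and CPU-bound, total SPS in this sketch should rise close to linearly until the worker count reaches the number of physical cores, which mirrors the scaling behavior the paper reports on a 16-core CPU.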

Authors (10)
  1. Zhehui Huang (10 papers)
  2. Sumeet Batra (9 papers)
  3. Tao Chen (397 papers)
  4. Rahul Krupani (2 papers)
  5. Tushar Kumar (4 papers)
  6. Artem Molchanov (11 papers)
  7. Aleksei Petrenko (11 papers)
  8. James A. Preiss (11 papers)
  9. Zhaojing Yang (4 papers)
  10. Gaurav S. Sukhatme (88 papers)
Citations (6)
