Multi-UAV Formation Control with Static and Dynamic Obstacle Avoidance via Reinforcement Learning (2410.18495v2)
Abstract: This paper tackles the challenging task of maintaining formation among multiple unmanned aerial vehicles (UAVs) while avoiding both static and dynamic obstacles during directed flight. The complexity of the task arises from its multi-objective nature, the large exploration space, and the sim-to-real gap. To address these challenges, we propose a two-stage reinforcement learning (RL) pipeline. In the first stage, we randomly search for a reward function that balances key objectives: directed flight, obstacle avoidance, formation maintenance, and zero-shot policy deployment. The second stage applies this reward function to more complex scenarios and utilizes curriculum learning to accelerate policy training. Additionally, we incorporate an attention-based observation encoder to improve formation maintenance and adaptability to varying obstacle densities. Experimental results in both simulation and real-world environments demonstrate that our method outperforms both planning-based and RL-based baselines in terms of collision-free rates and formation maintenance across static, dynamic, and mixed obstacle scenarios. Ablation studies further confirm the effectiveness of our curriculum learning strategy and attention-based encoder. Animated demonstrations are available at: https://sites.google.com/view/uav-formation-with-avoidance/.
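The abstract does not describe the encoder's architecture, so the following is only a minimal sketch of the general idea behind attention-based pooling: a variable number of observed obstacles is reduced to a fixed-size vector that a policy network could consume. The function name `attention_pool`, the feature dimensions, and the use of raw features as both keys and values are assumptions made for illustration; a trained encoder would normally use learned query, key, and value projections.

```python
import math
from typing import List, Sequence


def attention_pool(query: Sequence[float],
                   entities: List[Sequence[float]]) -> List[float]:
    """Scaled dot-product attention pooling over a variable-size set.

    Hypothetical helper: `query` stands in for the UAV's own-state
    embedding, `entities` for per-obstacle feature vectors of the same
    dimension. Raw features serve as both keys and values here.
    """
    dim = len(query)
    if not entities:
        # No obstacles observed: return a zero embedding of fixed size.
        return [0.0] * dim

    # Similarity of the query to each entity, scaled by sqrt(dim).
    scores = [
        sum(q * e for q, e in zip(query, ent)) / math.sqrt(dim)
        for ent in entities
    ]

    # Numerically stable softmax over the scores.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [x / total for x in exps]

    # Weighted sum of entity features: output size is independent of
    # how many entities were passed in.
    return [
        sum(w * ent[i] for w, ent in zip(weights, entities))
        for i in range(dim)
    ]


# Same output size for a sparse scene and a dense scene.
sparse = attention_pool([1.0, 0.0], [[0.5, 2.0]])
dense = attention_pool(
    [1.0, 0.0],
    [[0.5, 2.0], [3.0, -1.0], [0.1, 0.1], [2.0, 0.0]],
)
print(len(sparse), len(dense))
```

Because the softmax weights sum to one, the pooled vector is a convex combination of the obstacle features, which is what lets one fixed-size policy input cover scenes with few or many obstacles.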