Decentralized Multi-Robot Formation Control Using Reinforcement Learning (2306.14489v1)
Abstract: This paper presents a decentralized leader-follower formation control scheme for a swarm of small educational Sphero robots, based on a reinforcement learning (RL) algorithm. Since basic Q-learning is known to require large memory resources for its Q-tables, this work implements the Double Deep Q-Network (DDQN) algorithm, which has achieved excellent results in many robotic problems. To enhance the system behavior, we trained two separate DDQN models: one for reaching the formation and one for maintaining it. The models use a discrete set of robot motions (actions) to adapt the continuous nonlinear system to the discrete nature of RL. The presented approach has been tested in simulation and in real-world experiments, which show that the multi-robot system can achieve and maintain a stable formation without the need for complex mathematical models and nonlinear control laws.
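Since the learned policies are DDQN models over a discrete action set, a minimal sketch of the Double DQN target computation may help fix the idea. The sketch below is written in PyTorch; the network architecture, layer sizes, and names such as `q_net` and `target_net` are illustrative assumptions, not taken from the paper. The defining step of Double DQN is that the online network selects the greedy next action while a periodically synchronized target network evaluates it, which reduces the overestimation bias of vanilla DQN.

```python
import torch
import torch.nn as nn

class QNet(nn.Module):
    """Small fully connected Q-network: state in, one Q-value per
    discrete robot motion out. Layer sizes are assumptions."""
    def __init__(self, state_dim: int, n_actions: int):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(state_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.layers(state)

def ddqn_targets(q_net: QNet, target_net: QNet,
                 rewards: torch.Tensor, next_states: torch.Tensor,
                 dones: torch.Tensor, gamma: float = 0.99) -> torch.Tensor:
    """Double DQN bootstrapped target (van Hasselt et al., 2016)."""
    with torch.no_grad():
        # Action selection with the online network ...
        next_actions = q_net(next_states).argmax(dim=1, keepdim=True)
        # ... action evaluation with the target network.
        next_q = target_net(next_states).gather(1, next_actions).squeeze(1)
        # Terminal transitions keep only the immediate reward.
        return rewards + gamma * (1.0 - dones) * next_q
```

Training would then minimize, e.g., a Huber loss between `q_net(states).gather(1, actions)` and these targets; in the paper's setup, the formation-reaching and formation-keeping models would each run such an update with their own reward signal.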