MASP: Scalable GNN-based Planning for Multi-Agent Navigation (2312.02522v2)

Published 5 Dec 2023 in cs.LG, cs.AI, and cs.RO

Abstract: We investigate multi-agent navigation tasks, where multiple agents need to reach initially unassigned goals in a limited time. Classical planning-based methods suffer from expensive computation overhead at each step and offer limited expressiveness for complex cooperation strategies. In contrast, reinforcement learning (RL) has recently become a popular approach for addressing this issue. However, RL struggles with low data efficiency and cooperation when directly exploring (nearly) optimal policies in a large exploration space, especially with an increased number of agents (e.g., 10+ agents) or in complex environments (e.g., 3-D simulators). In this paper, we propose the Multi-Agent Scalable Graph-based Planner (MASP), a goal-conditioned hierarchical planner for navigation tasks with a substantial number of agents in the decentralized setting. MASP employs a hierarchical framework to reduce space complexity by decomposing a large exploration space into multiple goal-conditioned subspaces, where a high-level policy assigns goals to agents and a low-level policy navigates agents toward their designated goals. For agent cooperation and adaptation to varying team sizes, we model agents and goals as graphs to better capture their relationships. The high-level policy, the Goal Matcher, leverages a graph-based Self-Encoder and Cross-Encoder to optimize goal assignment by updating the agent and goal graphs. The low-level policy, the Coordinated Action Executor, introduces Group Information Fusion to facilitate group division and extract agent relationships across groups, enhancing training efficiency for agent cooperation. The results demonstrate that MASP outperforms RL-based and planning-based baselines in task efficiency.

Introduction

Within the field of multi-agent systems, efficiently navigating autonomous agents toward specific goals, particularly when multiple entities operate independently, is a challenging problem. Classical planning-based methods incur substantial per-step computation overhead and offer limited flexibility for expressing complex cooperation strategies. Reinforcement learning (RL) alternatives are promising, providing strong representational capabilities; however, they struggle with data efficiency and cooperation, especially as the number of agents grows.

Hierarchical Framework and GNN

The Multi-Agent Scalable GNN-based Planner (MASP) is built around a hierarchical framework that reduces the high-dimensional search space of navigation tasks by decomposing it into smaller, goal-conditioned subspaces. This structure accelerates training convergence and improves data efficiency. To facilitate cooperation and goal attainment among agents, MASP uses Graph Neural Networks (GNNs) to capture inter-agent relationships and agent-goal interactions.
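
To make the hierarchy concrete, here is a minimal sketch of a goal-conditioned two-level control loop in Python. The names (`goal_matcher`, `action_executor`, `replan_every`) and the environment interface are illustrative assumptions based on the description above, not the authors' implementation.

```python
# Minimal sketch of the two-level loop: a high-level policy periodically
# assigns goals, and a low-level policy steps agents toward those goals.
# The names and the environment interface are hypothetical, not the
# authors' implementation.

def run_episode(env, goal_matcher, action_executor,
                max_steps: int = 200, replan_every: int = 10) -> int:
    agent_states, goal_states = env.reset()
    assignment = None
    for t in range(max_steps):
        # High level: (re)match goals to agents at each "global step",
        # decomposing the joint search space into goal-conditioned subspaces.
        if t % replan_every == 0:
            assignment = goal_matcher.assign(agent_states, goal_states)
        # Low level: every agent acts toward its currently assigned goal.
        actions = action_executor.act(agent_states, assignment)
        (agent_states, goal_states), done = env.step(actions)
        if done:  # all goals reached within the time limit
            return t + 1
    return max_steps
```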

MASP comprises two key components:

  1. Multi-Goal Matcher (MGM): a decentralized graph-matching strategy that assigns the most appropriate goal to each agent at every global step (a rough sketch of the matching idea follows this list).
  2. Coordinated Action Executor (CAE): using a Graph Merger and a Goal Encoder, this component captures the correlation between agents and their assigned goals, promoting cooperative behavior.
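
As a rough illustration of the graph-based matching idea, the PyTorch sketch below encodes agent and goal sets with self-attention (standing in for the Self-Encoder), scores agent-goal pairs by dot product across sets (standing in for the Cross-Encoder), and assigns goals greedily. The layer sizes, the shared attention module, and the greedy matching are assumptions for illustration, not the paper's exact MGM architecture.

```python
import torch
import torch.nn as nn

class GoalMatcherSketch(nn.Module):
    """Illustrative agent-goal matcher: self-attention within each node set
    (standing in for the Self-Encoder), then dot-product scoring across sets
    (standing in for the Cross-Encoder). Not the paper's exact MGM."""

    def __init__(self, state_dim: int, hidden: int = 64):
        super().__init__()
        self.agent_enc = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        self.goal_enc = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        # One attention module shared by both graphs, for brevity.
        self.self_attn = nn.MultiheadAttention(hidden, num_heads=4,
                                               batch_first=True)

    def forward(self, agents: torch.Tensor, goals: torch.Tensor) -> torch.Tensor:
        # agents: (N, state_dim), goals: (M, state_dim)
        a = self.agent_enc(agents).unsqueeze(0)  # (1, N, hidden)
        g = self.goal_enc(goals).unsqueeze(0)    # (1, M, hidden)
        a, _ = self.self_attn(a, a, a)           # intra-agent relations
        g, _ = self.self_attn(g, g, g)           # intra-goal relations
        # Pairwise agent-goal compatibility scores.
        return torch.matmul(a, g.transpose(1, 2)).squeeze(0)  # (N, M)

def greedy_assign(scores: torch.Tensor) -> list[int]:
    """Greedily give each agent its best remaining goal (illustrative only;
    assumes at least as many goals as agents)."""
    assignment, taken = [], set()
    for row in scores:
        order = torch.argsort(row, descending=True)
        goal = next(int(j) for j in order if int(j) not in taken)
        taken.add(goal)
        assignment.append(goal)
    return assignment
```

For example, `greedy_assign(GoalMatcherSketch(4)(agents, goals))` with `agents` of shape `(N, 4)` and `goals` of shape `(M, 4)` yields one goal index per agent; in the paper the assignment is instead produced by the trained high-level policy rather than a fixed greedy rule.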

Experimental Performance

Empirically, MASP outperforms existing planning-based methods and RL baselines. In environments such as MPE and OmniDrones, which accommodate large groups of agents, MASP achieves nearly perfect success rates with minimal steps taken. Notably, in challenging 3D simulations with up to 20 agents, MASP generalizes well, performing effectively even at team sizes unseen during training.

Conclusion

MASP establishes efficient cooperative strategies, adapts to complex and dynamic environments, and generalizes well to scenarios with large numbers of agents. This makes it a compelling approach for decentralized multi-agent navigation and opens avenues for broader applications in multi-agent systems.

Authors (7)
  1. Xinyi Yang
  2. Xinting Yang
  3. Chao Yu
  4. Jiayu Chen
  5. Huazhong Yang
  6. Yu Wang
  7. Wenbo Ding