
Multi-Robot Cooperative Socially-Aware Navigation Using Multi-Agent Reinforcement Learning (2309.15234v2)

Published 26 Sep 2023 in cs.RO

Abstract: In public spaces shared with humans, ensuring multi-robot systems navigate without collisions while respecting social norms is challenging, particularly with limited communication. Although current robot social navigation techniques leverage advances in reinforcement learning and deep learning, they frequently overlook robot dynamics in simulations, leading to a simulation-to-reality gap. In this paper, we bridge this gap by presenting a new multi-robot social navigation environment crafted using Dec-POSMDP and multi-agent reinforcement learning. Furthermore, we introduce SAMARL: a novel benchmark for cooperative multi-robot social navigation. SAMARL employs a unique spatial-temporal transformer combined with multi-agent reinforcement learning. This approach effectively captures the complex interactions between robots and humans, thus promoting cooperative tendencies in multi-robot systems. Our extensive experiments reveal that SAMARL outperforms existing baseline and ablation models in our designed environment. Demo videos for this work can be found at: https://sites.google.com/view/samarl
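The architectural details of SAMARL's spatial-temporal transformer are not reproduced on this page, but the core mechanism the abstract describes — attention over the joint states of robots and humans to capture their interactions — can be sketched with a minimal single-head scaled dot-product attention. This is an illustrative simplification, not the paper's implementation; all names, dimensions, and the toy agent states below are assumptions.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention: one query agent attends over all agents."""
    d = len(query)
    scores = [sum(q * k for q, k in zip(query, key)) / math.sqrt(d) for key in keys]
    weights = softmax(scores)
    # Weighted sum of value vectors -> interaction-aware feature for the query agent.
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(len(values[0]))]

# Toy embedded states, one row per agent (robots and humans alike).
states = [
    [1.0, 0.0],  # robot 1
    [0.0, 1.0],  # human 1
    [0.7, 0.7],  # human 2
]

# Robot 1 attends over every agent, fusing neighbors' states into its own feature.
fused = attention(states[0], states, states)
print(fused)
```

In the full model, such attention layers would be stacked across both the spatial axis (agents at one timestep) and the temporal axis (each agent's state history), which is what the "spatial-temporal transformer" terminology suggests.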

Authors (4)
  1. Weizheng Wang (17 papers)
  2. Le Mao (6 papers)
  3. Ruiqi Wang (62 papers)
  4. Byung-Cheol Min (53 papers)
Citations (8)
