
A Study on Learning Social Robot Navigation with Multimodal Perception (2309.12568v1)

Published 22 Sep 2023 in cs.RO and cs.AI

Abstract: Autonomous mobile robots need to perceive their environments with onboard sensors (e.g., LiDARs and RGB cameras) and then make appropriate navigation decisions. In human-inhabited public spaces, navigation is more than obstacle avoidance: the robot must also consider surrounding humans and their intentions, and adjust its navigation behavior in response to the underlying social norms, i.e., be socially compliant. Machine learning methods have been shown to be effective at capturing those complex and subtle social interactions in a data-driven manner, without explicitly hand-crafting simplified models or cost functions. Considering multiple available sensor modalities and the efficiency of learning methods, this paper presents a comprehensive study on learning social robot navigation with multimodal perception using a large-scale real-world dataset. The study investigates social robot navigation decision making at both the global and local planning levels and contrasts unimodal and multimodal learning against a set of classical navigation approaches in different social scenarios, while also analyzing training and generalizability performance from the learning perspective. We also conduct a human study on how learning with multimodal perception affects perceived social compliance. The results show that multimodal learning has a clear advantage over unimodal learning in both the dataset and human studies. We open-source our code for the community's future use to study multimodal perception for learning social robot navigation.

Authors (4)
  1. Bhabaranjan Panigrahi
  2. Amir Hossain Raj
  3. Mohammad Nazeri
  4. Xuesu Xiao