
HabiCrowd: A High Performance Simulator for Crowd-Aware Visual Navigation (2306.11377v2)

Published 20 Jun 2023 in cs.CV

Abstract: Visual navigation, a foundational aspect of Embodied AI (E-AI), has been studied extensively in recent years. While many 3D simulators have been introduced to support visual navigation tasks, few works have incorporated human dynamics, leaving a gap between simulation and real-world applications. Furthermore, current 3D simulators that do incorporate human dynamics have several limitations, particularly in computational efficiency, which is a key promise of E-AI simulators. To overcome these shortcomings, we introduce HabiCrowd, the first standard benchmark for crowd-aware visual navigation that integrates a crowd dynamics model with diverse human settings into photorealistic environments. Empirical evaluations demonstrate that our proposed human dynamics model achieves state-of-the-art performance in collision avoidance while exhibiting superior computational efficiency compared to its counterparts. We leverage HabiCrowd to conduct several comprehensive studies on crowd-aware visual navigation tasks and human-robot interactions. The source code and data can be found at https://habicrowd.github.io/.
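The abstract does not spell out how the crowd dynamics model is formulated, so a brief illustration may help. The sketch below is a generic social-force-style pedestrian update of the kind commonly used in crowd simulation; it is not HabiCrowd's actual human dynamics model, and every function name and parameter value here is a placeholder assumption.

```python
# Illustrative sketch only: a minimal social-force-style crowd update.
# This is NOT HabiCrowd's human dynamics model (the abstract does not
# describe it); all names and constants below are hypothetical.
import numpy as np

def step_crowd(pos, vel, goals, dt=0.1, v_desired=1.3, tau=0.5,
               A=2.0, B=0.3, radius=0.3):
    """Advance N pedestrians by one time step.

    pos, vel, goals: (N, 2) arrays of positions, velocities, and goal points.
    Returns the updated (pos, vel).
    """
    # Driving term: each pedestrian relaxes toward a desired velocity
    # pointing at its goal.
    to_goal = goals - pos
    dist_goal = np.linalg.norm(to_goal, axis=1, keepdims=True) + 1e-8
    force = (v_desired * to_goal / dist_goal - vel) / tau

    # Repulsive term: pairwise exponential push-away between pedestrians,
    # which is what produces collision avoidance in this family of models.
    for i in range(len(pos)):
        diff = pos[i] - pos                       # vectors from others to i
        dist = np.linalg.norm(diff, axis=1) + 1e-8
        push = A * np.exp((2 * radius - dist) / B)
        push[i] = 0.0                             # no self-interaction
        force[i] += (push[:, None] * diff / dist[:, None]).sum(axis=0)

    vel = vel + dt * force
    pos = pos + dt * vel
    return pos, vel
```

A per-step update of this sort has to run for every simulated human alongside rendering and agent control, which is why the abstract emphasizes the computational efficiency of the human dynamics model.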

Authors (8)
  1. An Dinh Vuong (4 papers)
  2. Toan Tien Nguyen (2 papers)
  3. Baoru Huang (41 papers)
  4. Dzung Nguyen (3 papers)
  5. Huynh Thi Thanh Binh (14 papers)
  6. Thieu Vo (13 papers)
  7. Anh Nguyen (157 papers)
  8. Minh Nhat Vu (36 papers)
Citations (4)
