VLM-Social-Nav: Socially Aware Robot Navigation through Scoring using Vision-Language Models (2404.00210v3)
Abstract: We propose VLM-Social-Nav, a novel Vision-Language Model (VLM) based navigation approach to compute a robot's motion in human-centered environments. Our goal is to make real-time decisions on robot actions that are socially compliant with human expectations. We utilize a perception model to detect important social entities and prompt a VLM to generate guidance for socially compliant robot behavior. VLM-Social-Nav uses a VLM-based scoring module that computes a cost term steering the underlying planner toward socially appropriate and effective robot actions. Our overall approach reduces reliance on large training datasets and enhances adaptability in decision-making; in practice, it yields improved socially compliant navigation in human-shared environments. We demonstrate and evaluate our system in four different real-world social navigation scenarios with a Turtlebot robot, observing at least a 27.38% improvement in the average success rate and a 19.05% reduction in the average collision rate across the four scenarios. Our user study shows that VLM-Social-Nav generates the most socially compliant navigation behavior.
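To make the scoring idea concrete, the sketch below shows one plausible way a VLM-derived social cost could be folded into a DWA-style candidate selection loop. This is a minimal, hypothetical illustration under stated assumptions, not the authors' released implementation: `query_vlm_for_guidance`, the `Candidate` type, and the cost weights `alpha`/`beta`/`gamma` are all names invented here for illustration.

```python
# Minimal sketch of VLM-based scoring on top of a DWA-style planner.
# Hypothetical illustration only: query_vlm_for_guidance, the cost
# weights, and the Candidate type are assumptions, not the paper's API.
import math
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class Candidate:
    v: float  # linear velocity (m/s)
    w: float  # angular velocity (rad/s)


def query_vlm_for_guidance(image, detections) -> tuple[float, float]:
    """Stand-in for prompting a VLM (e.g., GPT-4V) with the camera image
    and detected social entities, then parsing its textual guidance into
    a reference action (v_ref, w_ref). Stubbed with a fixed suggestion."""
    return 0.3, 0.2


def social_cost(c: Candidate, v_ref: float, w_ref: float) -> float:
    """Penalize deviation from the VLM-suggested reference action."""
    return math.hypot(c.v - v_ref, c.w - w_ref)


def select_action(
    candidates: Iterable[Candidate],
    image,
    detections,
    goal_cost: Callable[[Candidate], float],
    obstacle_cost: Callable[[Candidate], float],
    alpha: float = 1.0,   # weight on progress toward the goal
    beta: float = 1.0,    # weight on obstacle proximity
    gamma: float = 0.5,   # weight on the VLM-based social cost term
) -> Candidate:
    """Pick the candidate action minimizing a weighted sum of costs."""
    v_ref, w_ref = query_vlm_for_guidance(image, detections)
    return min(
        candidates,
        key=lambda c: alpha * goal_cost(c)
        + beta * obstacle_cost(c)
        + gamma * social_cost(c, v_ref, w_ref),
    )
```

In this arrangement the VLM contributes only a scoring term: the underlying planner still generates the candidate actions, so kinematic feasibility and collision avoidance remain its responsibility.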
Authors: Daeun Song, Jing Liang, Amirreza Payandeh, Xuesu Xiao, Dinesh Manocha, Amir Hossain Raj