Sim-to-Real Transfer for Vision-and-Language Navigation (2011.03807v1)

Published 7 Nov 2020 in cs.CV, cs.CL, and cs.RO

Abstract: We study the challenging problem of releasing a robot in a previously unseen environment, and having it follow unconstrained natural language navigation instructions. Recent work on the task of Vision-and-Language Navigation (VLN) has achieved significant progress in simulation. To assess the implications of this work for robotics, we transfer a VLN agent trained in simulation to a physical robot. To bridge the gap between the high-level discrete action space learned by the VLN agent, and the robot's low-level continuous action space, we propose a subgoal model to identify nearby waypoints, and use domain randomization to mitigate visual domain differences. For accurate sim and real comparisons in parallel environments, we annotate a 325 m² office space with 1.3 km of navigation instructions, and create a digitized replica in simulation. We find that sim-to-real transfer to an environment not seen in training is successful if an occupancy map and navigation graph can be collected and annotated in advance (success rate of 46.8% vs. 55.9% in sim), but much more challenging in the hardest setting with no prior mapping at all (success rate of 22.5%).

Authors (7)
  1. Peter Anderson (30 papers)
  2. Ayush Shrivastava (8 papers)
  3. Joanne Truong (12 papers)
  4. Arjun Majumdar (16 papers)
  5. Devi Parikh (129 papers)
  6. Dhruv Batra (160 papers)
  7. Stefan Lee (62 papers)
Citations (96)
