
Vision-and-Language Navigation: A Survey of Tasks, Methods, and Future Directions (2203.12667v3)

Published 22 Mar 2022 in cs.CV, cs.AI, cs.CL, and cs.LG

Abstract: A long-term goal of AI research is to build intelligent agents that can communicate with humans in natural language, perceive the environment, and perform real-world tasks. Vision-and-Language Navigation (VLN) is a fundamental and interdisciplinary research topic towards this goal, and receives increasing attention from natural language processing, computer vision, robotics, and machine learning communities. In this paper, we review contemporary studies in the emerging field of VLN, covering tasks, evaluation metrics, methods, etc. Through structured analysis of current progress and challenges, we highlight the limitations of current VLN and opportunities for future work. This paper serves as a thorough reference for the VLN research community.

Authors (5)
  1. Jing Gu (29 papers)
  2. Eliana Stefani (1 paper)
  3. Qi Wu (323 papers)
  4. Jesse Thomason (65 papers)
  5. Xin Eric Wang (74 papers)
Citations (89)
