FollowNet: Robot Navigation by Following Natural Language Directions with Deep Reinforcement Learning (1805.06150v1)

Published 16 May 2018 in cs.RO, cs.AI, cs.CL, and cs.LG

Abstract: Understanding and following directions provided by humans can enable robots to navigate effectively in unknown situations. We present FollowNet, an end-to-end differentiable neural architecture for learning multi-modal navigation policies. FollowNet maps natural language instructions as well as visual and depth inputs to locomotion primitives. FollowNet processes instructions using an attention mechanism conditioned on its visual and depth input to focus on the relevant parts of the command while performing the navigation task. Deep reinforcement learning (RL) with a sparse reward simultaneously learns the state representation, the attention function, and the control policies. We evaluate our agent on a dataset of complex natural language directions that guide the agent through a rich and realistic dataset of simulated homes. We show that the FollowNet agent learns to execute previously unseen instructions described with a similar vocabulary, and successfully navigates along paths not encountered during training. The agent shows a 30% improvement over a baseline model without the attention mechanism, with a 52% success rate on novel instructions.
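The abstract's central idea, attending over instruction words conditioned on what the agent currently perceives, can be illustrated with a small sketch. This is not FollowNet's actual attention function, which the abstract does not specify: the dot-product scoring, the function names `softmax` and `attend`, and the toy vectors are all assumptions made for illustration, with `context` standing in for fused visual and depth features.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(token_embeddings, context):
    """Weight instruction tokens by dot-product relevance to the context.

    token_embeddings: list of d-dimensional vectors, one per instruction word.
    context: d-dimensional vector (hypothetical stand-in for visual/depth features).
    Returns (weights, attended), where attended is the weighted sum of tokens.
    """
    # Assumed scoring rule: plain dot product between each token and the context.
    scores = [
        sum(t * c for t, c in zip(token, context))
        for token in token_embeddings
    ]
    weights = softmax(scores)
    dim = len(context)
    attended = [
        sum(w * token[j] for w, token in zip(weights, token_embeddings))
        for j in range(dim)
    ]
    return weights, attended


# Toy data: three "words" embedded in 2-D, and a context that favours the second.
tokens = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
context = [0.0, 2.0]
weights, attended = attend(tokens, context)
```

In this toy setup the second token aligns best with the context, so it receives the largest weight. In a learned model the embeddings and the scoring function would be trained jointly with the policy rather than fixed by hand.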

Authors (5)
  1. Pararth Shah (13 papers)
  2. Marek Fiser (7 papers)
  3. Aleksandra Faust (60 papers)
  4. J. Chase Kew (7 papers)
  5. Dilek Hakkani-Tur (94 papers)
Citations (51)
