
DialFRED: Dialogue-Enabled Agents for Embodied Instruction Following (2202.13330v2)

Published 27 Feb 2022 in cs.AI and cs.RO

Abstract: Language-guided Embodied AI benchmarks requiring an agent to navigate an environment and manipulate objects typically allow one-way communication: the human user gives a natural language command to the agent, and the agent can only follow the command passively. We present DialFRED, a dialogue-enabled embodied instruction following benchmark based on the ALFRED benchmark. DialFRED allows an agent to actively ask questions to the human user; the additional information in the user's response is used by the agent to better complete its task. We release a human-annotated dataset with 53K task-relevant questions and answers and an oracle to answer questions. To solve DialFRED, we propose a questioner-performer framework wherein the questioner is pre-trained with the human-annotated data and fine-tuned with reinforcement learning. We make DialFRED publicly available and encourage researchers to propose and evaluate their solutions to building dialog-enabled embodied agents.

Authors (6)
  1. Xiaofeng Gao (53 papers)
  2. Qiaozi Gao (20 papers)
  3. Ran Gong (17 papers)
  4. Kaixiang Lin (22 papers)
  5. Govind Thattai (25 papers)
  6. Gaurav S. Sukhatme (88 papers)
Citations (62)
