Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Adapt2Reward: Adapting Video-Language Models to Generalizable Robotic Rewards via Failure Prompts (2407.14872v1)

Published 20 Jul 2024 in cs.CV and cs.RO

Abstract: For a general-purpose robot to operate in reality, executing a broad range of instructions across various environments is imperative. Central to the reinforcement learning and planning for such robotic agents is a generalizable reward function. Recent advances in vision-LLMs, such as CLIP, have shown remarkable performance in the domain of deep learning, paving the way for open-domain visual recognition. However, collecting data on robots executing various language instructions across multiple environments remains a challenge. This paper aims to transfer video-LLMs with robust generalization into a generalizable language-conditioned reward function, only utilizing robot video data from a minimal amount of tasks in a singular environment. Unlike common robotic datasets used for training reward functions, human video-language datasets rarely contain trivial failure videos. To enhance the model's ability to distinguish between successful and failed robot executions, we cluster failure video features to enable the model to identify patterns within. For each cluster, we integrate a newly trained failure prompt into the text encoder to represent the corresponding failure mode. Our language-conditioned reward function shows outstanding generalization to new environments and new instructions for robot planning and reinforcement learning.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (8)
  1. Yanting Yang (10 papers)
  2. Minghao Chen (37 papers)
  3. Qibo Qiu (11 papers)
  4. Jiahao Wu (45 papers)
  5. Wenxiao Wang (63 papers)
  6. Binbin Lin (50 papers)
  7. Ziyu Guan (34 papers)
  8. Xiaofei He (70 papers)

Summary

We haven't generated a summary for this paper yet.

Youtube Logo Streamline Icon: https://streamlinehq.com