Mitigating the Human-Robot Domain Discrepancy in Visual Pre-training for Robotic Manipulation (2406.14235v2)

Published 20 Jun 2024 in cs.CV and cs.RO

Abstract: Learning generalizable visual representations across different embodied environments is essential for effective robotic manipulation in real-world scenarios. However, the limited scale and diversity of robot demonstration data pose a significant challenge. Recent research has explored leveraging large-scale human activity data for pre-training, but the substantial morphological differences between humans and robots introduce a significant human-robot domain discrepancy, hindering the generalization of these models to downstream manipulation tasks. To overcome this, we propose a novel adaptation paradigm that leverages readily available paired human-robot video data to bridge the domain gap. Our method employs a human-robot contrastive alignment loss to align the semantics of human and robot videos, adapting pre-trained models to the robot domain in a parameter-efficient manner. Experiments on 20 simulated tasks across two different benchmarks and five real-world tasks demonstrate significant improvements. These results span both single-task and language-conditioned multi-task settings, evaluated using two different pre-trained models. Compared to existing pre-trained models, our adaptation method improves the average success rate by over $7\%$ across multiple tasks on both simulated benchmarks and real-world evaluations. We will release the code and models.

Authors (6)
  1. Jiaming Zhou (41 papers)
  2. Teli Ma (22 papers)
  3. Kun-Yu Lin (24 papers)
  4. Ronghe Qiu (7 papers)
  5. Zifan Wang (75 papers)
  6. Junwei Liang (47 papers)
Citations (2)
