Self-Supervised Learning of Multi-Object Keypoints for Robotic Manipulation (2205.08316v2)

Published 17 May 2022 in cs.RO and cs.CV

Abstract: In recent years, policy learning methods using either reinforcement or imitation have made significant progress. However, both techniques still suffer from being computationally expensive and requiring large amounts of training data. This problem is especially prevalent in real-world robotic manipulation tasks, where access to ground truth scene features is not available and policies are instead learned from raw camera observations. In this paper, we demonstrate the efficacy of learning image keypoints via the Dense Correspondence pretext task for downstream policy learning. Extending prior work to challenging multi-object scenes, we show that our model can be trained to deal with important problems in representation learning, primarily scale-invariance and occlusion. We evaluate our approach on diverse robot manipulation tasks, compare it to other visual representation learning approaches, and demonstrate its flexibility and effectiveness for sample-efficient policy learning.

Authors (4)
  1. Jan Ole von Hartz (7 papers)
  2. Eugenio Chisari (11 papers)
  3. Tim Welschehold (27 papers)
  4. Abhinav Valada (117 papers)
Citations (5)
