
DISP6D: Disentangled Implicit Shape and Pose Learning for Scalable 6D Pose Estimation (2107.12549v2)

Published 27 Jul 2021 in cs.CV

Abstract: Scalable 6D pose estimation for rigid objects from RGB images aims at handling multiple objects and generalizing to novel objects. Building on a well-known auto-encoding framework to cope with object symmetry and the lack of labeled training data, we achieve scalability by disentangling the latent representation of the auto-encoder into shape and pose sub-spaces. The latent shape space models the similarity of different objects through contrastive metric learning, and the latent pose code is compared with canonical rotations for rotation retrieval. Because different object symmetries induce inconsistent latent pose spaces, we re-entangle the shape representation with canonical rotations to generate shape-dependent pose codebooks for rotation retrieval. We show state-of-the-art performance on two benchmarks, one containing textureless CAD objects without categories and the other containing daily objects with categories, and further demonstrate improved scalability by extending to a more challenging setting of daily objects across categories.
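
The rotation retrieval step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract only says the pose code is "compared" with a codebook, so cosine similarity is assumed here as the comparison, the codebook is filled with random vectors in place of codes produced by a trained network from a shape code, and the names `retrieve_rotation` and `rotation_z` are hypothetical.

```python
import numpy as np


def retrieve_rotation(pose_code, codebook, rotations):
    """Return the canonical rotation whose codebook entry best matches pose_code.

    Assumption: similarity is measured by cosine similarity.
    """
    query = pose_code / np.linalg.norm(pose_code)
    entries = codebook / np.linalg.norm(codebook, axis=1, keepdims=True)
    scores = entries @ query              # one cosine similarity per entry
    best = int(np.argmax(scores))
    return rotations[best], best


def rotation_z(theta):
    """3x3 rotation matrix about the z-axis (used only to build a toy rotation set)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])


rng = np.random.default_rng(0)

# Toy set of 8 canonical rotations; a real system would sample SO(3) densely.
angles = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
rotations = np.stack([rotation_z(t) for t in angles])

# Stand-in for a shape-dependent pose codebook: one 16-D code per rotation.
# In the paper these codes are generated from the shape code and the
# canonical rotations; random vectors are used here purely for illustration.
codebook = rng.normal(size=(8, 16))

# A query pose code that is a slightly perturbed copy of entry 3.
query_code = codebook[3] + 0.01 * rng.normal(size=16)

R, idx = retrieve_rotation(query_code, codebook, rotations)
print(idx)
```

Because each object would get its own codebook under this scheme, objects with different symmetries do not have to share one pose embedding, which is the motivation the abstract gives for re-entangling shape with the canonical rotations.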

Authors (7)
  1. Yilin Wen
  2. Xiangyu Li
  3. Hao Pan
  4. Lei Yang
  5. Zheng Wang
  6. Taku Komura
  7. Wenping Wang
Citations (8)
