
EPOS: Estimating 6D Pose of Objects with Symmetries (2004.00605v1)

Published 1 Apr 2020 in cs.CV, cs.LG, cs.RO, and eess.IV

Abstract: We present a new method for estimating the 6D pose of rigid objects with available 3D models from a single RGB input image. The method is applicable to a broad range of objects, including challenging ones with global or partial symmetries. An object is represented by compact surface fragments which allow handling symmetries in a systematic manner. Correspondences between densely sampled pixels and the fragments are predicted using an encoder-decoder network. At each pixel, the network predicts: (i) the probability of each object's presence, (ii) the probability of the fragments given the object's presence, and (iii) the precise 3D location on each fragment. A data-dependent number of corresponding 3D locations is selected per pixel, and poses of possibly multiple object instances are estimated using a robust and efficient variant of the PnP-RANSAC algorithm. In the BOP Challenge 2019, the method outperforms all RGB and most RGB-D and D methods on the T-LESS and LM-O datasets. On the YCB-V dataset, it is superior to all competitors, with a large margin over the second-best RGB method. Source code is at: cmp.felk.cvut.cz/epos.

Authors (3)
  1. Tomas Hodan (22 papers)
  2. Daniel Barath (71 papers)
  3. Jiri Matas (133 papers)
Citations (217)

Summary

EPOS: Estimating 6D Pose of Objects with Symmetries

The paper presents EPOS, a method for estimating the 6D pose of rigid objects with available 3D models from a single RGB input image, with a particular focus on objects with global or partial symmetries. Traditional model-based approaches to 6D pose estimation establish 2D-3D correspondences between the input image and the object model, and then recover the pose with a robust solver such as PnP-RANSAC. However, handling symmetries systematically and effectively remains a major challenge for these approaches.

EPOS represents each object by a set of compact surface fragments. This representation enables the systematic handling of both global and partial symmetries and ensures uniform coverage of candidate 3D locations. An encoder-decoder convolutional neural network predicts dense correspondences between image pixels and fragments. For each pixel, the network predicts: (i) the probability of each object's presence, (ii) the probability of each fragment given the object's presence, and (iii) the precise 3D location on each fragment.
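
The three per-pixel outputs can be turned into a data-dependent number of 3D candidates roughly as follows. This is a minimal sketch, not the authors' implementation: the function name `select_correspondences`, the two threshold values, and the assumption that the 3D location is expressed as an offset from a fragment center are all illustrative choices that the summary does not specify.

```python
import numpy as np

def select_correspondences(p_obj, p_frag, centers, offsets,
                           tau_obj=0.1, tau_frag=0.5):
    """Select candidate 3D points for one pixel and one object.

    p_obj    -- probability that the object is present at the pixel
    p_frag   -- (n,) probabilities of the n fragments given the object
    centers  -- (n, 3) fragment centers in model coordinates (assumed)
    offsets  -- (n, 3) predicted 3D offsets from each center (assumed)
    tau_obj, tau_frag -- placeholder thresholds, not values from the paper
    """
    p_frag = np.asarray(p_frag, dtype=float)
    if p_obj <= tau_obj:
        # Object unlikely at this pixel: no correspondences.
        return np.empty((0, 3)), np.empty(0)
    # Keep every fragment whose probability is close to the best one,
    # so a symmetric pixel yields several candidates, an unambiguous one few.
    keep = p_frag / p_frag.max() > tau_frag
    points = np.asarray(centers)[keep] + np.asarray(offsets)[keep]
    confidence = p_obj * p_frag[keep]  # joint probability of object and fragment
    return points, confidence

# Toy pixel: two fragments are equally likely, two are unlikely.
centers = np.array([[0.0, 0.0, 0.1], [0.0, 0.0, -0.1],
                    [0.1, 0.0, 0.0], [-0.1, 0.0, 0.0]])
offsets = np.full((4, 3), 0.01)
points, conf = select_correspondences(0.9, [0.45, 0.45, 0.05, 0.05],
                                      centers, offsets)
print(points.shape)
```

In this toy case the pixel contributes two 2D-3D correspondences instead of one, which is how ambiguity caused by symmetry is passed on to the pose-fitting stage rather than resolved prematurely.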

A key contribution of EPOS is the prediction of many-to-many 2D-3D correspondences to account for symmetries, which degrade the performance of methods that assume one-to-one correspondences. To estimate poses from these correspondences, the authors propose a robust and efficient variant of the PnP-RANSAC algorithm within the Progressive-X framework, which also exploits the spatial coherence of correspondences to improve pose estimation.
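
To see why a one-to-one model breaks down, consider a toy object with 4-fold rotational symmetry about its z-axis. The sketch below uses made-up camera intrinsics and a made-up pose; the helper names `rot_z`, `project`, and `equivalent_points` are hypothetical. It checks that a single pixel is explained equally well by four different 3D model points, each paired with a pose under which the symmetric object looks identical.

```python
import numpy as np

def rot_z(angle):
    """Rotation matrix about the z-axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def project(K, R, t, x):
    """Pinhole projection of model point x under pose (R, t)."""
    p = K @ (R @ x + t)
    return p[:2] / p[2]

def equivalent_points(x, symmetries):
    """All model points that are indistinguishable from x under the symmetries."""
    return [S @ x for S in symmetries]

# Illustrative camera and pose (not taken from the paper).
K = np.array([[600.0, 0.0, 320.0],
              [0.0, 600.0, 240.0],
              [0.0, 0.0, 1.0]])
R = rot_z(0.3)
t = np.array([0.05, -0.02, 1.0])
x = np.array([0.10, 0.03, 0.02])

# Symmetry group of the toy object: rotations by 0, 90, 180, 270 degrees.
symmetries = [rot_z(k * np.pi / 2) for k in range(4)]

pixel = project(K, R, t, x)
candidates = equivalent_points(x, symmetries)
for S, x_eq in zip(symmetries, candidates):
    # The pose (R S^T, t) renders the symmetric object identically,
    # and under it the point S x lands on the very same pixel.
    assert np.allclose(project(K, R @ S.T, t, x_eq), pixel)

print(len(candidates), "model points are consistent with one pixel")
```

A network trained to regress a single 3D point per pixel has no principled way to choose among these candidates, whereas keeping all of them and letting a robust estimator find a consistent subset sidesteps the choice.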

Numerical Results and Claims:

  • In the BOP Challenge 2019, EPOS outperforms all RGB methods and most RGB-D and D methods on the T-LESS and LM-O datasets. On YCB-V, it is superior to all competitors, with a 27% absolute improvement over the second-best RGB method.
  • The results on T-LESS and LM-O are notable because these datasets are known to be challenging due to texture-less and symmetric objects.

Implications and Future Work:

EPOS is relevant to applications such as robotic manipulation, augmented reality, and autonomous driving, where accurate object pose estimation is crucial. The systematic handling of symmetries using surface fragments could inspire future approaches to object representation and correspondence learning. Potential research directions include studying the impact of object-specific fragmentation strategies and improving the method's speed without sacrificing accuracy.

EPOS further establishes the utility and accuracy of RGB-only approaches for object pose estimation where depth data may not be readily available. This work highlights the necessity for robust model fitting algorithms in the presence of high outlier ratios inherent in the many-to-many correspondence problem.

In summary, EPOS is a significant step forward in object pose estimation: it addresses a fundamental limitation in the handling of symmetric objects while improving accuracy over existing methods, and the directions outlined above leave room for further work.