Overview of the Paper "Robust 6D Object Pose Estimation by Learning RGB-D Features"
The paper presents a method for estimating 6D object poses with a focus on robustness in challenging conditions such as varying illumination, background clutter, and occlusion. The task is to recover both the translation and rotation of an object; the method uses RGB-D input to improve accuracy over approaches that rely on RGB data alone. To handle symmetric objects, the paper proposes a discrete-continuous rotation formulation that avoids the local-optimum pitfalls of the conventional ShapeMatch-Loss.
Key Contributions
- Discrete-Continuous Rotation Regression: Rotation estimation is decomposed into a discrete choice among rotation anchors sampled uniformly over SO(3) and a continuous regression of the local deviation from each anchor, together with an uncertainty score used to select the best anchor. Because each regression target stays within a small neighborhood of its anchor, this decomposition mitigates the convergence problems that symmetric objects cause during training (see the first sketch after this list).
- Utilization of RGB-D Features: The network densely extracts and fuses appearance features from RGB with geometric features from depth, point by point. The geometric cues make the method markedly more robust than RGB-only approaches, which degrade under illumination and appearance changes (a fusion sketch follows this list).
- Dual-Branch Network for Decoupled Estimation: Separate branches estimate translation and rotation. Translation is recovered by RANSAC-based voting over point-wise predictions, which tolerates occlusion and background interference (see the voting sketch below).
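The anchor mechanism can be illustrated with a minimal sketch. Everything below (the anchor count K, the random anchor sampling standing in for the paper's uniform SO(3) sampling, and the helper names) is assumed for illustration rather than taken from the paper's code:

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

K = 60  # hypothetical anchor count; the paper samples anchors uniformly
        # over SO(3). Random sampling here is only a stand-in for that grid.
anchors = R.random(K, random_state=0)

def nearest_anchor(gt_rot):
    """Index of the anchor closest to the ground-truth rotation.

    Used as the classification target during training.
    """
    rel = anchors.inv() * gt_rot             # relative rotation to each anchor
    return int(np.argmin(rel.magnitude()))   # geodesic distance on SO(3)

def compose(anchor_idx, deviation_rotvec):
    """Final prediction: anchor composed with the regressed local deviation."""
    return anchors[anchor_idx] * R.from_rotvec(deviation_rotvec)
```

At test time, per-anchor uncertainty scores would select anchor_idx, and the regressed deviation stays small because each anchor only has to cover its local neighborhood of SO(3).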
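Point-wise RGB-D fusion can be sketched as follows; the module name, feature dimensions, and the PointNet-style encoder are illustrative assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class PointwiseFusion(nn.Module):
    """Concatenate per-pixel appearance features with per-point geometry."""

    def __init__(self, rgb_dim=32, geo_dim=32):
        super().__init__()
        self.geo_mlp = nn.Sequential(   # PointNet-style per-point encoder
            nn.Conv1d(3, 64, 1), nn.ReLU(),
            nn.Conv1d(64, geo_dim, 1),
        )

    def forward(self, rgb_feat, points):
        # rgb_feat: (B, rgb_dim, N) CNN features sampled at the N pixels
        #           that have valid depth measurements
        # points:   (B, 3, N) camera-frame coordinates from the depth map
        geo_feat = self.geo_mlp(points)             # (B, geo_dim, N)
        return torch.cat([rgb_feat, geo_feat], 1)   # (B, rgb_dim+geo_dim, N)
```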
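The translation voting can likewise be sketched. The hypothesis sampling, inlier threshold, and iteration count below are generic RANSAC choices under assumed parameter values; the paper's exact procedure may differ:

```python
import numpy as np

def ransac_translation(votes, iters=100, inlier_thresh=0.01, seed=None):
    """Aggregate point-wise votes for the object center.

    votes: (N, 3) per-point predictions of the object center, in meters.
    """
    rng = np.random.default_rng(seed)
    best_center, best_count = None, -1
    for _ in range(iters):
        hypothesis = votes[rng.integers(len(votes))]      # sample one vote
        dists = np.linalg.norm(votes - hypothesis, axis=1)
        inliers = dists < inlier_thresh
        if inliers.sum() > best_count:
            best_count = inliers.sum()
            best_center = votes[inliers].mean(axis=0)     # refine on inliers
    return best_center
```

Because each hypothesis needs only a single vote, outlier predictions from occluded or background points are simply outvoted rather than averaged in.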
Experimental Validation
The method is evaluated on two benchmarks, LINEMOD and YCB-Video, and outperforms prior state-of-the-art approaches. On LINEMOD, it achieves a notable gain in ADD accuracy (92.8% vs. 86.3%), excelling in challenging scenarios with small, texture-less objects. On YCB-Video, it reaches 83.8% on the ADD metric, a 4.6-point improvement over the previous best method.
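For reference, the ADD metric (Hinterstoisser et al.) averages the distance between model points transformed by the predicted and ground-truth poses; a pose is typically counted correct when this average falls below 10% of the object diameter. A minimal sketch:

```python
import numpy as np

def add_metric(model_pts, R_pred, t_pred, R_gt, t_gt):
    """Average distance of model points under predicted vs. ground-truth pose.

    model_pts: (N, 3) object model points; R_*: (3, 3); t_*: (3,).
    """
    pred = model_pts @ R_pred.T + t_pred
    gt = model_pts @ R_gt.T + t_gt
    return np.linalg.norm(pred - gt, axis=1).mean()

# For symmetric objects, the ADD-S variant instead averages, for each
# transformed ground-truth point, its distance to the closest predicted point.
```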
Technical Implications and Future Directions
The rotation anchors offer a scalable answer to symmetry-induced ambiguities, and the network runs fast enough for real-time use without sacrificing accuracy. The dual-branch design suggests a promising template for decoupling complex transformations in other vision tasks.
Looking forward, the paper hints at training purely on synthetic data, which would reduce reliance on annotated real-world images and broaden applicability. Exploring the predicted uncertainty scores for pose refinement, or integrating them into robotic grasping strategies, are further promising directions.
The work is a solid contribution to robotics and computer vision, offering valuable insight into long-standing challenges of 6D pose estimation, especially in complex and dynamic environments.