GeoReF: Geometric Alignment Across Shape Variation for Category-level Object Pose Refinement (2404.11139v1)
Abstract: Object pose refinement is essential for robust object pose estimation. Previous work has made significant progress towards instance-level object pose refinement. Yet category-level pose refinement is a more challenging problem because of large shape variations within a category and discrepancies between the target object and the shape prior. To address these challenges, we introduce a novel architecture for category-level object pose refinement. Our approach integrates an HS-layer and learnable affine transformations, which aim to enhance the extraction and alignment of geometric information. Additionally, we introduce a cross-cloud transformation mechanism that efficiently merges diverse data sources. Finally, we push the limits of our model by incorporating shape prior information for translation and size error prediction. Extensive quantitative experiments demonstrate the effectiveness of the proposed framework, which improves on the baseline method by a large margin across all metrics.