Learning SO(3)-Invariant Semantic Correspondence via Local Shape Transform (2404.11156v2)
Abstract: Establishing accurate 3D correspondences between shapes is a pivotal challenge with broad implications for computer vision and robotics. However, existing self-supervised methods for this problem assume perfectly aligned input shapes, which restricts their real-world applicability. In this work, we introduce RIST, a novel self-supervised Rotation-Invariant 3D correspondence learner with Local Shape Transform, which learns to establish dense correspondences between shapes even under challenging intra-class variations and arbitrary orientations. Specifically, RIST learns to dynamically formulate an SO(3)-invariant local shape transform for each point, which maps the SO(3)-equivariant global shape descriptor of the input shape to a local shape descriptor. These local shape descriptors are passed to our decoder to perform point cloud self- and cross-reconstruction. Our proposed self-supervised training pipeline encourages semantically corresponding points from different shapes to be mapped to similar local shape descriptors, enabling RIST to establish dense point-wise correspondences. RIST demonstrates state-of-the-art performance on 3D part label transfer and semantic keypoint transfer given arbitrarily rotated point cloud pairs, outperforming existing methods by significant margins.
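The interplay between an SO(3)-equivariant global descriptor and an SO(3)-invariant per-point transform can be checked numerically on a toy example. The sketch below is not the RIST network: `global_descriptor`, `local_transform`, and `local_descriptor` are hypothetical hand-written stand-ins (norm-weighted point averages and inner products) chosen only because their equivariance and invariance are easy to verify. It assumes the transform acts linearly over the channel axis of the global descriptor, and it matches points by nearest neighbor on the invariant transforms, which is one plausible way to read off correspondences.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def rotation_matrix(axis, angle):
    """Rodrigues' formula: rotation about `axis` by `angle` radians."""
    n = math.sqrt(dot(axis, axis))
    x, y, z = (a / n for a in axis)
    c, s, t = math.cos(angle), math.sin(angle), 1 - math.cos(angle)
    return [
        [t * x * x + c,     t * x * y - s * z, t * x * z + s * y],
        [t * x * y + s * z, t * y * y + c,     t * y * z - s * x],
        [t * x * z - s * y, t * y * z + s * x, t * z * z + c],
    ]

def rotate(vectors, R):
    """Apply the 3x3 matrix R to every 3D vector in `vectors`."""
    return [tuple(dot(R[r], v) for r in range(3)) for v in vectors]

def global_descriptor(points, num_channels=4):
    """Toy equivariant descriptor (assumption): channel c = mean of |p|^c * p."""
    n = len(points)
    Z = []
    for c in range(num_channels):
        weights = [math.sqrt(dot(p, p)) ** c for p in points]
        Z.append(tuple(sum(w * p[k] for w, p in zip(weights, points)) / n
                       for k in range(3)))
    return Z

def local_transform(point, Z, out_channels=2):
    """Toy invariant per-point transform (assumption), built from inner products."""
    inv = [dot(point, z) for z in Z]  # unchanged when point and Z rotate together
    return [[math.tanh((j + 1) * v) for v in inv] for j in range(out_channels)]

def local_descriptor(T, Z):
    """Mix the channels of Z with T, giving len(T) 3D vectors for one point."""
    return [tuple(sum(T[j][c] * Z[c][k] for c in range(len(Z))) for k in range(3))
            for j in range(len(T))]

def flat(rows):
    return [v for row in rows for v in row]

random.seed(0)
shape = [tuple(random.uniform(-1, 1) for _ in range(3)) for _ in range(16)]
R = rotation_matrix((1.0, 2.0, 3.0), 0.7)
rotated = rotate(shape, R)

Z, Z_rot = global_descriptor(shape), global_descriptor(rotated)
T = [local_transform(p, Z) for p in shape]
T_rot = [local_transform(p, Z_rot) for p in rotated]
L = [local_descriptor(t, Z) for t in T]
L_rot = [local_descriptor(t, Z_rot) for t in T_rot]

# Invariance: per-point transforms are the same before and after rotation.
transform_gap = max(abs(a - b) for t, tr in zip(T, T_rot)
                    for a, b in zip(flat(t), flat(tr)))

# Equivariance: local descriptors of the rotated shape equal rotated descriptors.
descriptor_gap = max(abs(a - b) for l, lr in zip(L, L_rot)
                     for a, b in zip(flat(rotate(l, R)), flat(lr)))

def nearest(query, candidates):
    """Index of the candidate transform closest to `query` (squared L2)."""
    dists = [sum((a - b) ** 2 for a, b in zip(flat(query), flat(c)))
             for c in candidates]
    return dists.index(min(dists))

# Match each rotated point to a point of the original shape via the transforms.
matches = [nearest(tr, T) for tr in T_rot]
```

In this toy setting both gaps stay at floating-point noise level, and every rotated point is matched back to its own index, because the matching feature never sees the orientation of the input. A learned model would replace the hand-written functions with networks that have the same symmetry properties.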