Diff-Reg v1: Diffusion Matching Model for Registration Problem
Abstract: Establishing reliable correspondences is essential for registration tasks such as 3D and 2D3D registration. Existing methods commonly leverage geometric or semantic point features to generate candidate correspondences. However, these features often struggle under large deformation, scale inconsistency, and ambiguous matching (e.g., symmetry). Additionally, many previous methods rely on single-pass prediction and may become trapped in local minima in complex scenarios. To mitigate these challenges, we introduce a diffusion matching model for robust correspondence construction. Our approach treats correspondence estimation as a denoising diffusion process within the doubly stochastic matrix space, gradually denoising (refining) a doubly stochastic matching matrix toward the ground-truth one for high-quality correspondence estimation. The model comprises a forward diffusion process that gradually injects Gaussian noise into the ground-truth matching matrix and a reverse denoising process that iteratively refines the noisy matching matrix. Notably, backbone feature extraction runs only once during inference; our lightweight denoising module reuses these features at every reverse sampling step. Evaluation on both 3D and 2D3D registration tasks confirms the effectiveness of our method. The code is available at https://github.com/wuqianliang/Diff-Reg.
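The forward/reverse process described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `denoiser` interface, the softmax-then-Sinkhorn projection onto the doubly stochastic set, and the noise schedule are all assumptions made for the sketch; the actual denoising module and matching head are defined in the paper's code.

```python
import numpy as np

def sinkhorn(M, iters=50):
    # Project a positive matrix toward the doubly stochastic set by
    # alternating row/column normalization (Sinkhorn-Knopp).
    for _ in range(iters):
        M = M / M.sum(axis=1, keepdims=True)
        M = M / M.sum(axis=0, keepdims=True)
    return M

def forward_diffuse(x0, t, alpha_bar):
    # q(x_t | x_0): blend the ground-truth matching matrix with Gaussian
    # noise, then map back into the doubly stochastic space (assumed
    # projection: elementwise exp followed by Sinkhorn normalization).
    eps = np.random.randn(*x0.shape)
    xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return sinkhorn(np.exp(xt))

def reverse_sample(denoiser, features, shape, timesteps, alpha_bar):
    # Backbone features are extracted once; the lightweight `denoiser`
    # (hypothetical interface) reuses them at every reverse step.
    xt = sinkhorn(np.exp(np.random.randn(*shape)))  # start from noise
    for t in reversed(range(timesteps)):
        x0_pred = denoiser(features, xt, t)  # predict clean matching matrix
        # Re-noise the prediction to step t-1, or stop at t=0.
        xt = forward_diffuse(x0_pred, t - 1, alpha_bar) if t > 0 else x0_pred
    return xt
```

Correspondences would then be read off the final matrix, e.g. by mutual top-1 selection, before solving for the transformation.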