Multiway Point Cloud Mosaicking with Diffusion and Global Optimization (2404.00429v1)
Abstract: We introduce a novel framework for multiway point cloud mosaicking (named Wednesday), designed to co-align sets of partially overlapping point clouds -- typically obtained from 3D scanners or moving RGB-D cameras -- into a unified coordinate system. At the core of our approach is ODIN, a learned pairwise registration algorithm that iteratively identifies overlaps and refines attention scores, employing a diffusion-based process for denoising pairwise correlation matrices to enhance matching accuracy. Further steps include constructing a pose graph from all point clouds, performing rotation averaging, and applying a novel robust algorithm that re-estimates translations optimally in terms of consensus maximization and translation optimization. Finally, the point cloud rotations and positions are optimized jointly by a diffusion-based approach. Tested on four diverse, large-scale datasets, our method achieves state-of-the-art pairwise and multiway registration results by a large margin on all benchmarks. Our code and models are available at https://github.com/jinsz/Multiway-Point-Cloud-Mosaicking-with-Diffusion-and-Global-Optimization.
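To make the pose graph step concrete, the following is a minimal sketch of how pairwise registrations can be chained into absolute poses in one coordinate system. It is not the method of the paper: it only propagates relative transforms along a breadth-first spanning tree, which is a common way to initialize a pose graph before rotation averaging and translation re-estimation. The function names (`propagate_poses`, `invert_rigid`, `rot_z`) and the edge convention (an edge `(i, j, T_ij)` maps coordinates of cloud `j` into the frame of cloud `i`) are assumptions made for this illustration.

```python
from collections import deque
import math


def matmul(a, b):
    """Multiply two 4x4 homogeneous transforms stored as nested lists."""
    return [[sum(a[r][k] * b[k][c] for k in range(4)) for c in range(4)]
            for r in range(4)]


def invert_rigid(t):
    """Invert a rigid transform [R | p]; the inverse is [R^T | -R^T p]."""
    rt = [[t[c][r] for c in range(3)] for r in range(3)]
    p = [-sum(rt[r][k] * t[k][3] for k in range(3)) for r in range(3)]
    return [rt[r] + [p[r]] for r in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def rot_z(theta, tx=0.0, ty=0.0, tz=0.0):
    """Rigid transform: rotation by theta about z, then translation."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]


def propagate_poses(num_clouds, edges, root=0):
    """Chain pairwise transforms along a BFS tree rooted at `root`.

    Assumed convention: an edge (i, j, T_ij) maps points expressed in
    cloud j's frame into cloud i's frame. The returned pose of cloud k
    maps its points into the root's frame. Clouds that are not connected
    to the root receive no pose.
    """
    adjacency = {i: [] for i in range(num_clouds)}
    for i, j, t_ij in edges:
        adjacency[i].append((j, t_ij))
        adjacency[j].append((i, invert_rigid(t_ij)))  # reverse direction

    identity = [[1.0 if r == c else 0.0 for c in range(4)] for r in range(4)]
    poses = {root: identity}
    queue = deque([root])
    while queue:
        i = queue.popleft()
        for j, t_ij in adjacency[i]:
            if j not in poses:
                poses[j] = matmul(poses[i], t_ij)  # T_j = T_i * T_ij
                queue.append(j)
    return poses


# Three clouds in a chain; each pairwise estimate is a 90 degree turn about z
# plus a unit step along x. The second edge is stored in reverse on purpose.
t_step = rot_z(math.pi / 2, tx=1.0)
edges = [(0, 1, t_step), (2, 1, invert_rigid(t_step))]
poses = propagate_poses(3, edges)
```

Spanning-tree propagation ignores every edge outside the tree, so errors accumulate along the chain. Global steps such as rotation averaging exist to use the redundant edges and distribute that error over the whole graph.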