
RKHS-BA: A Robust Correspondence-Free Multi-View Registration Framework with Semantic Point Clouds (2403.01254v2)

Published 2 Mar 2024 in cs.RO

Abstract: This work reports a novel multi-frame Bundle Adjustment (BA) framework called RKHS-BA. It uses continuous landmark representations that encode RGB-D/LiDAR and semantic observations in a Reproducing Kernel Hilbert Space (RKHS). With a correspondence-free pose graph formulation, the proposed system constructs a loss function that achieves more generalized convergence than classical point-wise convergence. We demonstrate its applications in multi-view point cloud registration, sliding-window odometry, and global LiDAR mapping on simulated and real data. It shows highly robust pose estimations in extremely noisy scenes and exhibits strong generalization with various types of semantic inputs. The open source implementation is released in https://github.com/UMich-CURLY/RKHS_BA.


Summary

  • The paper introduces a correspondence-free framework that optimizes multi-view registration by integrating semantic labels and leveraging RKHS representations.
  • It employs dual-sum optimization to unify geometric and semantic data, enhancing registration accuracy without relying on traditional feature correspondences.
  • Global rotation initialization using Icosahedral symmetry and an IRLS solver improves robustness and performance in texture-less and dynamic environments.

Comprehensive Analysis of RKHS-BA: A Correspondence-Free Multi-View Registration Framework with Global Tracking

Introduction

Bundle Adjustment (BA) is a staple technique in computer vision, particularly in Simultaneous Localization and Mapping (SLAM) and 3D reconstruction. Traditional BA methods rely predominantly on feature correspondences, which can be unreliable in texture-less or dynamic environments. To address this, the paper introduces RKHS-BA (Reproducing Kernel Hilbert Space Bundle Adjustment), a correspondence-free BA framework that incorporates RGB-D/LiDAR measurements and semantic labels directly into the optimization. RKHS-BA generalizes the photometric loss functions commonly used in direct methods and is reported to be highly robust in challenging scenes.

Theoretical Foundations

RKHS-BA leverages the concept of representing point cloud observations as functions in a Reproducing Kernel Hilbert Space. This representation enables the direct use of RGB-D/LiDAR and semantic labels in optimization without necessitating pre-established correspondences, which is a significant departure from traditional BA methodologies. Central to this approach is the formulation of an objective function that measures alignment through inner products in RKHS, embodying both geometric and semantic congruence without explicit correspondences.
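To make the inner-product objective concrete, here is a minimal sketch of how alignment between two labelled point clouds could be scored as a double sum over all point pairs. The squared-exponential kernel, the `lengthscale` value, the dot-product label similarity, and the function name `rkhs_inner_product` are illustrative assumptions, not details taken from the paper or its released code.

```python
import numpy as np

def rkhs_inner_product(points_a, labels_a, points_b, labels_b, lengthscale=0.1):
    """Double sum over all point pairs of (geometric kernel) x (label similarity)."""
    # Pairwise squared distances between the two clouds, shape (N, M).
    diff = points_a[:, None, :] - points_b[None, :, :]
    sq_dist = np.sum(diff ** 2, axis=-1)
    # Assumed squared-exponential kernel on 3D positions.
    geometric = np.exp(-sq_dist / (2.0 * lengthscale ** 2))
    # Assumed label similarity: dot product of per-point label vectors.
    semantic = labels_a @ labels_b.T
    return float(np.sum(geometric * semantic))

# Toy data: three points with one-hot labels over two classes.
points = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
labels = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]])
shifted = points + np.array([0.5, 0.0, 0.0])

aligned_score = rkhs_inner_product(points, labels, points, labels)
shifted_score = rkhs_inner_product(points, labels, shifted, labels)
```

No pair of points is ever declared a match: every pair contributes, weighted by how close the points are and how similar their labels are. In this toy case the aligned cloud scores higher than the shifted copy, which is the property an optimizer over poses would exploit.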

Methodology

The RKHS-BA framework incorporates several novel components:

  1. Correspondence-Free Formulation: Unlike conventional methods that rely on strict feature correspondences, RKHS-BA employs a dual-sum optimization objective that respects both geometric and semantic associations in a continuous spatial-semantic functional representation.
  2. Global Rotation Initialization: RKHS-BA can initialize the rotation globally by examining the finite set of rotations in the icosahedral symmetry group, which makes it more robust to poor initial guesses and facilitates global registration.
  3. Semantically Informed IRLS Backend: The optimization process of RKHS-BA is realized through an Iteratively Reweighted Least Squares (IRLS) solver, with weights derived from both geometric and semantic similarities, introducing a level of robustness and semantic awareness uncommon in traditional BA techniques.
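The global rotation initialization described in the list above can be sketched as follows. The snippet builds the 60 rotations of the icosahedral group by closing two generator rotations under multiplication, then picks the candidate with the highest kernel alignment score. The scoring function, the `lengthscale` value, and all function names are assumptions made for illustration; the summary does not say how the paper ranks candidate rotations.

```python
import numpy as np

def axis_angle_matrix(axis, angle):
    """Rodrigues' formula: rotation by `angle` about the given axis."""
    axis = np.asarray(axis, dtype=float)
    x, y, z = axis / np.linalg.norm(axis)
    K = np.array([[0.0, -z, y], [z, 0.0, -x], [-y, x, 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

def icosahedral_rotations():
    """Generate the icosahedral rotation group by closing two generators."""
    phi = (1.0 + np.sqrt(5.0)) / 2.0
    generators = [
        axis_angle_matrix([0.0, 1.0, phi], 2.0 * np.pi / 5.0),  # five-fold axis through a vertex
        axis_angle_matrix([1.0, 1.0, 1.0], 2.0 * np.pi / 3.0),  # three-fold axis through a face centre
    ]
    group = [np.eye(3)]
    frontier = [np.eye(3)]
    while frontier:
        next_frontier = []
        for R in frontier:
            for G in generators:
                candidate = G @ R
                # Keep the candidate only if it is not already in the group.
                if not any(np.allclose(candidate, H, atol=1e-8) for H in group):
                    group.append(candidate)
                    next_frontier.append(candidate)
        frontier = next_frontier
    return group

def alignment_score(source, target, lengthscale=0.5):
    """Assumed score: sum of squared-exponential kernel values over all point pairs."""
    diff = source[:, None, :] - target[None, :, :]
    return float(np.sum(np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * lengthscale ** 2))))

def best_initial_rotation(source, target, rotations):
    """Return the candidate rotation that best aligns `source` to `target`."""
    scores = [alignment_score(source @ R.T, target) for R in rotations]
    return rotations[int(np.argmax(scores))]

rotations = icosahedral_rotations()
source = np.array([[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0], [1.0, 1.0, 0.5]])
target = source @ rotations[17].T  # target is the source rotated by a known group element
initial_rotation = best_initial_rotation(source, target, rotations)
```

Because the candidate set is finite and spread evenly over the rotation space, a coarse search like this needs no initial guess. In a full pipeline the selected rotation would only seed a local solver such as the IRLS backend, which is not sketched here.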

Empirical Evaluation

RKHS-BA was evaluated on simulated and real-world data in multi-view point cloud registration, sliding-window odometry, and global LiDAR mapping. The reported results show robust pose estimation in extremely noisy scenes, particularly where traditional methods falter because of poor texture or dynamic changes. Benchmarked against state-of-the-art alternatives, RKHS-BA also showed a favorable trade-off between generalization and accuracy.

Implications and Future Directions

RKHS-BA presents a robust approach to multi-view registration in adverse conditions and opens new avenues for research in semantic SLAM. Its combination of a correspondence-free formulation with semantic awareness makes it a promising candidate for dense semantic mapping, and potentially for other domains where direct and feature-based methods struggle.

In conclusion, while RKHS-BA establishes a new paradigm in bundle adjustment, exploring its full potential, especially in leveraging semantic hierarchies and improving computational efficiency, remains an open direction for future work.
