
3DGS-Calib: 3D Gaussian Splatting for Multimodal SpatioTemporal Calibration (2403.11577v2)

Published 18 Mar 2024 in cs.CV and cs.RO

Abstract: Reliable multimodal sensor fusion algorithms require accurate spatiotemporal calibration. Recently, targetless calibration techniques based on implicit neural representations have proven to provide precise and robust results. Nevertheless, such methods are inherently slow to train given the high computational overhead caused by the large number of sampled points required for volume rendering. With the recent introduction of 3D Gaussian Splatting as a faster alternative to implicit representation methods, we propose to leverage this new rendering approach to achieve faster multi-sensor calibration. We introduce 3DGS-Calib, a new calibration method that relies on the speed and rendering accuracy of 3D Gaussian Splatting to achieve multimodal spatiotemporal calibration that is accurate, robust, and with a substantial speed-up compared to methods relying on implicit neural representations. We demonstrate the superiority of our proposal with experimental results on sequences from KITTI-360, a widely used driving dataset.


Summary

  • The paper introduces a novel calibration method using 3D Gaussian Splatting to enhance multimodal sensor fusion.
  • It demonstrates that the proposed approach significantly outperforms NeRF-based techniques in both speed and accuracy on the KITTI-360 dataset.
  • The method enables fast, targetless sensor calibration for autonomous systems, removing the need for cumbersome target-based procedures.

Overview of 3DGS-Calib: 3D Gaussian Splatting for Multimodal SpatioTemporal Calibration

The paper presents a novel approach to multimodal spatiotemporal calibration, termed 3DGS-Calib. It addresses a fundamental challenge of sensor fusion in robotics: accurate spatiotemporal alignment between multimodal sensors such as LiDAR and RGB cameras, which is crucial for effective data integration and scene understanding. Accurate calibration is a prerequisite for tasks common in autonomous systems, such as localization, mapping, and object detection.
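To make concrete what "spatiotemporal calibration" estimates, here is a minimal Python sketch: each non-reference sensor carries a rigid extrinsic relative to a reference sensor plus a clock offset used when looking up the reference trajectory. The toy trajectory, function names, and values are illustrative assumptions, not taken from the paper.

```python
# Minimal sketch (not the authors' code) of the quantities optimized in
# spatiotemporal calibration: a rigid extrinsic (R, t) per sensor relative to a
# reference sensor, and a per-sensor time offset delta_t used to interpolate
# the reference trajectory. Trajectory and numbers are purely illustrative.
import numpy as np
from scipy.spatial.transform import Rotation, Slerp

# Toy reference trajectory: poses of the reference sensor at known timestamps.
traj_times = np.array([0.0, 0.5, 1.0])
traj_rots = Rotation.from_euler("z", [0.0, 5.0, 10.0], degrees=True)
traj_pos = np.array([[0.0, 0.0, 0.0], [0.5, 0.0, 0.0], [1.0, 0.0, 0.0]])
slerp = Slerp(traj_times, traj_rots)

def sensor_pose_at(t_sensor, delta_t, R_extr, t_extr):
    """World pose of a sensor whose sample was taken at time t_sensor.

    delta_t         : temporal offset between the sensor clock and the reference clock.
    R_extr, t_extr  : spatial extrinsic mapping the sensor frame to the reference frame.
    """
    t_ref = np.clip(t_sensor + delta_t, traj_times[0], traj_times[-1])
    R_ref = slerp([t_ref])[0].as_matrix()
    p_ref = np.array([np.interp(t_ref, traj_times, traj_pos[:, i]) for i in range(3)])
    # Compose: world <- reference <- sensor.
    return R_ref @ R_extr, R_ref @ t_extr + p_ref

# Example: a camera sample at t = 0.4 s with a 20 ms clock offset and an extrinsic guess.
R_w, p_w = sensor_pose_at(0.4, 0.02,
                          Rotation.from_euler("y", 2, degrees=True).as_matrix(),
                          np.array([0.1, 0.0, 0.3]))
```

Calibration then amounts to finding the extrinsic and time offset of each sensor that best explain its observations of the shared scene.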

Proposed Methodology

Traditional calibration methods often rely on target-based strategies, which involve physical targets and manual data collection and are therefore cumbersome and ill-suited to open-world applications. Alternatively, neural implicit representation methods such as Neural Radiance Fields (NeRF) have gained traction because they enable targetless calibration. These methods, though accurate, are computationally intensive and require long training times, which hinders their practicality in real-time or on-the-fly scenarios.
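As a rough illustration of the overhead the paper alludes to, the per-image cost of volume rendering scales with the number of rays times the number of samples per ray; the figures below are assumptions chosen for illustration, not measurements from the paper.

```python
# Back-of-the-envelope sketch (illustrative numbers, not from the paper) of why
# NeRF-style volume rendering is expensive: every rendered pixel needs many MLP
# evaluations along its ray, whereas 3DGS rasterizes an explicit set of
# Gaussians without per-sample network queries.
rays_per_image = 1408 * 376      # a KITTI-360-style image resolution (assumed)
samples_per_ray = 128            # a typical coarse+fine sampling budget (assumed)
mlp_queries = rays_per_image * samples_per_ray
print(f"{mlp_queries:,} MLP queries to render one full image")  # ~67.8 million
```

Replacing these per-sample network queries with rasterization of explicit primitives is the source of the speed-up exploited by 3DGS-based methods.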

The key innovation explored in this paper is the use of 3D Gaussian Splatting (3DGS) as an alternative rendering approach. 3DGS offers several advantages over traditional NeRF-based techniques by significantly reducing training time while maintaining high levels of accuracy in calibration. This is achieved by using an explicit representation of the scene with 3D Gaussians that allow for rapid convergence without sacrificing detail.
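The calibration-by-rendering pattern this enables can be shown with a self-contained toy: explicit Gaussian primitives are splatted into an image, and a photometric loss between the rendered and observed images is backpropagated into a pose parameter. The sketch below is a 2D analogue with a plain translation standing in for a sensor extrinsic, not the paper's pipeline; all names and values are illustrative.

```python
# Toy 2D analogue (not the paper's implementation) of calibration by
# differentiable splatting: render a scene of 2D Gaussians, compare to an
# observed image, and backpropagate the photometric error into a pose
# parameter (here a 2D translation standing in for an extrinsic).
import torch

def splat(mu, color, sigma, offset, H=64, W=64):
    """Render an H x W image by summing isotropic 2D Gaussians shifted by `offset`."""
    ys, xs = torch.meshgrid(torch.arange(H, dtype=torch.float32),
                            torch.arange(W, dtype=torch.float32), indexing="ij")
    pix = torch.stack([xs, ys], dim=-1)                        # (H, W, 2)
    centers = mu + offset                                      # (N, 2)
    d2 = ((pix[None] - centers[:, None, None]) ** 2).sum(-1)   # (N, H, W)
    weights = torch.exp(-d2 / (2 * sigma[:, None, None] ** 2))
    return (color[:, None, None] * weights).sum(0)             # (H, W)

torch.manual_seed(0)
N = 30
mu = torch.rand(N, 2) * 48 + 8            # Gaussian centers inside the image
color = torch.rand(N)
sigma = torch.full((N,), 2.0)

true_offset = torch.tensor([3.5, -2.0])            # "ground-truth extrinsic"
observed = splat(mu, color, sigma, true_offset)    # the sensor's observed image

offset = torch.zeros(2, requires_grad=True)        # initial calibration guess
opt = torch.optim.Adam([offset], lr=0.1)
for step in range(300):
    opt.zero_grad()
    loss = torch.mean((splat(mu, color, sigma, offset) - observed) ** 2)
    loss.backward()
    opt.step()

print("recovered offset:", offset.detach().numpy(), "true:", true_offset.numpy())
```

In the actual method the pose parameters would be full SE(3) extrinsics plus per-sensor time offsets, and the renderer a 3D Gaussian rasterizer, but the gradient flow is the same: photometric error jointly drives the scene representation and the calibration parameters.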

Experimental Results

The authors validate the effectiveness of their proposed method through experiments on the KITTI-360 dataset, showing that 3DGS-Calib surpasses existing NeRF-based approaches in both speed and accuracy. Notably, the system achieves robust calibration without the scene-specific features or additional supervision often required by prior methods. The results demonstrate the potential of 3DGS-Calib to deliver superior temporal and spatial alignment, thus facilitating efficient sensor fusion.

Implications and Future Prospects

The introduction of 3D Gaussian Splatting into the domain of multimodal sensor calibration carries substantial implications for real-time systems where computational resources and time constraints are critical. This advancement sets the stage for the deployment of advanced autonomous systems that can rapidly adapt to new environments without extensive pre-calibration, rendering them effective for dynamic and unpredictable applications.

Looking forward, relaxing the assumption that LiDAR points are concentrated on the lower structures of the environment would open this methodology to more diverse sensor configurations and to applications beyond traditional urban driving scenes. Additionally, with further refinement and integration, the principles laid out in this paper could inspire new directions in AI-driven spatial analytics and sensor technology.

3DGS-Calib stands as a promising step in the evolution of sensor calibration, offering the efficiency and precision needed to enable the next generation of intelligent systems. Its balance between computational feasibility and calibration accuracy highlights the potential of explicit scene representations in modern robotics and perception tasks.
