RTG-SLAM: Real-time 3D Reconstruction at Scale using Gaussian Splatting (2404.19706v3)
Abstract: We present Real-time Gaussian SLAM (RTG-SLAM), a real-time 3D reconstruction system for large-scale environments that uses an RGBD camera and Gaussian splatting. The system features a compact Gaussian representation and a highly efficient on-the-fly Gaussian optimization scheme. We force each Gaussian to be either opaque or nearly transparent: the opaque ones fit the surface and dominant colors, and the transparent ones fit residual colors. By rendering depth differently from color, we let a single opaque Gaussian fit a local surface region well without multiple overlapping Gaussians, which greatly reduces memory and computation cost. For on-the-fly Gaussian optimization, we explicitly add Gaussians for three types of pixels per frame: newly observed pixels, pixels with large color errors, and pixels with large depth errors. We also categorize all Gaussians as stable or unstable: stable Gaussians are expected to fit previously observed RGBD images well, and all others are unstable. We optimize only the unstable Gaussians and render only the pixels they occupy. This greatly reduces both the number of Gaussians to optimize and the number of pixels to render, so the optimization can run in real time. We show real-time reconstructions of a variety of large scenes. Compared with the state-of-the-art NeRF-based RGBD SLAM, our system achieves comparably high-quality reconstruction at around twice the speed and half the memory cost, and shows superior performance in the realism of novel view synthesis and camera tracking accuracy.
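The per-frame rule for adding Gaussians can be illustrated with a minimal sketch. It is not the authors' implementation: the function name, the threshold values, and the use of low accumulated opacity as a proxy for "newly observed" are all assumptions made here for illustration, since the abstract names only the three pixel categories.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    # Hypothetical values; the abstract gives no numbers.
    min_opacity: float = 0.5     # rendered opacity below this: treated as newly observed
    max_depth_err: float = 0.05  # absolute depth error, same unit as the depth map
    max_color_err: float = 0.1   # largest per-channel error, colors in [0, 1]


def pixels_needing_gaussians(obs_color, obs_depth, ren_color, ren_depth,
                             ren_alpha, th=Thresholds()):
    """Return {(row, col): reason} for pixels where a Gaussian would be added.

    All inputs are nested lists indexed [row][col]; colors are RGB tuples.
    The reason is "new", "depth", or "color", checked in that order
    (the ordering is an assumption of this sketch).
    """
    selected = {}
    for r, depth_row in enumerate(obs_depth):
        for c, d_obs in enumerate(depth_row):
            if d_obs <= 0:
                continue  # no valid depth measurement at this pixel
            if ren_alpha[r][c] < th.min_opacity:
                selected[(r, c)] = "new"
            elif abs(ren_depth[r][c] - d_obs) > th.max_depth_err:
                selected[(r, c)] = "depth"
            else:
                color_err = max(abs(a - b) for a, b in
                                zip(ren_color[r][c], obs_color[r][c]))
                if color_err > th.max_color_err:
                    selected[(r, c)] = "color"
    return selected


# Tiny 2x2 frame: one pixel per category, plus one pixel with invalid depth.
gray = (0.5, 0.5, 0.5)
observed_color = [[gray, gray], [gray, gray]]
observed_depth = [[1.0, 1.0], [1.0, 0.0]]
rendered_alpha = [[0.0, 1.0], [1.0, 1.0]]
rendered_depth = [[0.0, 1.5], [1.0, 1.0]]
rendered_color = [[(0.0, 0.0, 0.0), gray], [(0.9, 0.5, 0.5), gray]]

result = pixels_needing_gaussians(observed_color, observed_depth,
                                  rendered_color, rendered_depth,
                                  rendered_alpha)
print(result)
```

A real system would run this test on the GPU over full-resolution images; the nested-list version only makes the three-way classification explicit.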