3DGS-ReLoc: 3D Gaussian Splatting for Map Representation and Visual ReLocalization (2403.11367v1)
Abstract: This paper presents a novel system for 3D mapping and visual relocalization using 3D Gaussian Splatting. Our method fuses LiDAR and camera data to create accurate and visually plausible representations of the environment. By leveraging LiDAR data to initialize the training of the 3D Gaussian Splatting map, our system constructs maps that are both detailed and geometrically accurate. To curb GPU memory usage and enable fast spatial queries, we organize the map with a 2D voxel grid combined with a KD-tree. This organization makes the map well suited to visual localization: correspondences between the query image and images rendered from the Gaussian Splatting map are identified efficiently via normalized cross-correlation (NCC), and the camera pose of the query image is then refined using feature-based matching and the Perspective-n-Point (PnP) algorithm. The effectiveness, adaptability, and precision of our system are demonstrated through extensive evaluation on the KITTI-360 dataset.
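To make the memory-management idea concrete, the following is a minimal sketch, not the authors' implementation, of a 2D voxel map combined with a KD-tree: Gaussian centers are bucketed into ground-plane voxels, the occupied voxel centers are indexed with a KD-tree, and only the Gaussians within a radius of a coarse pose estimate are kept resident for rendering. The array name `gaussian_xyz`, the voxel size, and the query radius are illustrative assumptions.

```python
# Minimal sketch (assumed interface, not the paper's code): bound GPU memory
# by loading only the Gaussians near a coarse pose estimate.
import numpy as np
from scipy.spatial import cKDTree

VOXEL_SIZE = 10.0    # metres per 2D voxel cell (illustrative value)
QUERY_RADIUS = 50.0  # metres around the coarse pose to keep resident (illustrative value)

def build_2d_index(gaussian_xyz: np.ndarray):
    """Bucket Gaussian centers into a 2D (x, y) voxel grid and index the
    occupied voxel centers with a KD-tree for fast radius queries."""
    keys = np.floor(gaussian_xyz[:, :2] / VOXEL_SIZE).astype(np.int64)
    voxels = {}
    for idx, key in enumerate(map(tuple, keys)):
        voxels.setdefault(key, []).append(idx)
    centers = (np.array(list(voxels.keys()), dtype=np.float64) + 0.5) * VOXEL_SIZE
    return cKDTree(centers), list(voxels.values())

def query_submap(tree, voxel_members, coarse_xy):
    """Return indices of all Gaussians whose voxel center lies within
    QUERY_RADIUS of the coarse (x, y) position; only these go to the GPU."""
    hits = tree.query_ball_point(np.asarray(coarse_xy, dtype=np.float64), QUERY_RADIUS)
    if not hits:
        return np.empty(0, dtype=np.int64)
    return np.concatenate([np.asarray(voxel_members[i]) for i in hits])
```

In use, `build_2d_index` is run once over the trained map, and `query_submap` is called whenever the coarse pose moves to a new region, so the renderer never has to hold the full city-scale set of Gaussians in GPU memory at once.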
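The coarse-to-fine relocalization step can be sketched in the same spirit: the query image is scored against candidate renderings with normalized cross-correlation, and the pose of the best candidate is refined by matching 2D features and solving PnP. ORB features stand in here for the learned feature matching used in the paper, and lifting rendered keypoints to 3D through a rendered depth map is an assumption of this sketch.

```python
# Minimal sketch of NCC scoring plus feature-based PnP refinement
# (assumptions: grayscale images of equal size, a rendered depth map, intrinsics K).
import cv2
import numpy as np

def ncc_score(query_gray, render_gray):
    """Normalized cross-correlation between the query image and a rendering
    from the Gaussian Splatting map (both grayscale, same size)."""
    return float(cv2.matchTemplate(render_gray, query_gray, cv2.TM_CCORR_NORMED)[0, 0])

def refine_pose(query_gray, render_gray, render_depth, K):
    """Match 2D features between the query and the best rendering, lift the
    rendered keypoints to 3D using the rendered depth, and solve PnP (RANSAC)."""
    orb = cv2.ORB_create(2000)          # stand-in for learned feature matching
    kq, dq = orb.detectAndCompute(query_gray, None)
    kr, dr = orb.detectAndCompute(render_gray, None)
    matches = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True).match(dq, dr)

    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]
    obj_pts, img_pts = [], []
    for m in matches:
        u, v = kr[m.trainIdx].pt
        z = render_depth[int(v), int(u)]
        if z <= 0:                      # skip pixels without valid rendered depth
            continue
        obj_pts.append([(u - cx) * z / fx, (v - cy) * z / fy, z])  # camera frame of the render
        img_pts.append(kq[m.queryIdx].pt)

    ok, rvec, tvec, inliers = cv2.solvePnPRansac(
        np.asarray(obj_pts, np.float32), np.asarray(img_pts, np.float32), K, None)
    return ok, rvec, tvec
```

The recovered `rvec`/`tvec` express the query camera relative to the rendering's camera frame; composing them with the render pose gives the refined global pose.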