- The paper presents the U-Scene dataset together with a LiDAR-fused Gaussian Splatting approach to address challenges in large-scale 3D reconstruction.
- It fuses high-accuracy LiDAR data with RGB images, resolving coordinate alignment between the two modalities to enhance reconstruction fidelity.
- Qualitative results indicate improved urban scene synthesis, underscoring the method’s potential for city-scale 3D mapping.
Introduction
3D reconstruction using Neural Radiance Fields (NeRF) has evolved rapidly, leading to significant strides in synthesizing novel views from sparse 2D image data. Recent advancements optimize various aspects of NeRF, including techniques drawn from meta-learning and sparsity exploitation. Nonetheless, NeRF models still face challenges such as long training times and limitations in handling large-scale, complex scenes. Gaussian Splatting introduces an alternative scene representation, explicit 3D Gaussians rendered through differentiable rasterization, which makes view synthesis fast enough to be particularly useful for large-scale scenarios.
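To make the representation concrete, the following is a minimal sketch of the per-primitive parameters commonly stored in 3D Gaussian Splatting implementations. The field names and helper are illustrative conventions, not details taken from the paper.

```python
import numpy as np
from dataclasses import dataclass

@dataclass
class Gaussian3D:
    """One splatting primitive; field names are illustrative, not from the paper."""
    mean: np.ndarray       # (3,) center in world coordinates
    scale: np.ndarray      # (3,) per-axis extent (often stored as log-scale)
    rotation: np.ndarray   # (4,) unit quaternion (w, x, y, z) orienting the covariance
    opacity: float         # alpha in [0, 1] after a sigmoid activation
    sh_coeffs: np.ndarray  # spherical-harmonic coefficients for view-dependent color

    def covariance(self) -> np.ndarray:
        """Sigma = R S S^T R^T; this factorization keeps Sigma positive semi-definite."""
        w, x, y, z = self.rotation / np.linalg.norm(self.rotation)
        R = np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])
        S = np.diag(self.scale)
        return R @ S @ S.T @ R.T
```

Storing scale and rotation separately, rather than the covariance itself, is the standard way implementations keep the covariance valid throughout optimization.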
Dataset and Acquisition Methodology
The paper presents a comprehensive RGB dataset named U-Scene, which includes corresponding LiDAR ground-truth data, overcoming the limitations of current datasets that either lack accurate ground truth or are not designed for reconstruction. A DJI Matrice 300 drone equipped with the high-accuracy Zenmuse L1 LiDAR captures extensive data covering 1.5 km² of urban and academic environments. The dataset not only addresses issues such as inconsistent capture times across images but also incorporates rooftop geometry, which is typically missing from existing datasets.
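As a usage sketch, the snippet below loads one paired RGB frame and the LiDAR ground-truth cloud. The directory layout, file names, and formats are assumptions for illustration; the summary does not specify U-Scene's on-disk structure.

```python
import numpy as np
import open3d as o3d   # pip install open3d
from PIL import Image

# Hypothetical layout; U-Scene's actual on-disk format is not given in the summary.
SCENE_DIR = "u_scene/campus_01"

def load_sample(scene_dir: str):
    """Load one RGB frame and the scene's LiDAR ground-truth point cloud."""
    rgb = np.asarray(Image.open(f"{scene_dir}/images/000001.jpg"))
    cloud = o3d.io.read_point_cloud(f"{scene_dir}/lidar/ground_truth.ply")
    return rgb, np.asarray(cloud.points)

rgb, points = load_sample(SCENE_DIR)
print(rgb.shape, points.shape)  # e.g. (H, W, 3) and (N, 3)
```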
Gaussian Splatting and LiDAR-Image Fusion
Gaussian Splatting, a recent 3D representation approach, is analyzed for its efficacy and limitations in large-scale scene reconstruction. Novel views are synthesized through rasterization of the Gaussians, while drone-collected RGB and LiDAR data supply a precise reference point cloud. The paper investigates combining LiDAR data with image-based Gaussian Splatting, yielding improvements in reconstruction accuracy. The fusion process involves three steps: downsampling the dense LiDAR point cloud to a density that Gaussian Splatting optimizes well, merging it with the image-derived point data, and applying a novel technique to resolve the coordinate misalignment between the LiDAR and image frames; a sketch of the first and third steps follows below.
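The sketch below illustrates two of these steps under stated assumptions: voxel downsampling of the dense LiDAR cloud (via Open3D) and a similarity (Umeyama) alignment between the LiDAR and image coordinate frames. The paper's own alignment technique is not detailed in this summary, so the Umeyama estimate here stands in as a common baseline, not the authors' method.

```python
import numpy as np
import open3d as o3d  # pip install open3d

def downsample_lidar(points: np.ndarray, voxel_size: float = 0.2) -> np.ndarray:
    """Thin the dense LiDAR cloud to a density Gaussian Splatting handles well.
    The voxel size is an illustrative choice, not a value from the paper."""
    pcd = o3d.geometry.PointCloud(o3d.utility.Vector3dVector(points))
    return np.asarray(pcd.voxel_down_sample(voxel_size).points)

def umeyama_similarity(src: np.ndarray, dst: np.ndarray):
    """Least-squares scale s, rotation R, translation t with dst ~ s * R @ src + t.
    Standard Umeyama estimate; the paper's actual alignment technique may differ."""
    mu_s, mu_d = src.mean(axis=0), dst.mean(axis=0)
    xs, xd = src - mu_s, dst - mu_d
    U, D, Vt = np.linalg.svd(xd.T @ xs / len(src))
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0  # reflection correction keeps R a proper rotation
    R = U @ S @ Vt
    s = np.trace(np.diag(D) @ S) / xs.var(axis=0).sum()
    t = mu_d - s * R @ mu_s
    return s, R, t
```

In practice, (s, R, t) would be estimated from matched reference points (for example, RTK-tagged landmarks or SfM keypoints), applied to the downsampled LiDAR cloud, and the result concatenated with the image-derived points to initialize the Gaussians.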
Results and Considerations for Future Research
The results, evaluated on the U-Scene dataset with both vanilla Gaussian Splatting and LiDAR-fused Gaussian Splatting, reveal modest quantitative improvements from LiDAR fusion. The qualitative improvements are more pronounced, suggesting that image data alone may not faithfully capture 3D structure and that a fusion approach can yield more accurate reconstructions. The dataset demonstrates the potential of drone-collected data for city-scale reconstruction and underscores the need to include point-cloud ground truth in evaluation; one common way to score against such ground truth is sketched below. Future work may explore mitigating edge artifacts in Gaussian Splatting to further reduce reconstruction error and developing methodologies to generate larger datasets that efficiently leverage both point-cloud and image data.
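Since the summary does not name the point-cloud metric used, the sketch below shows the symmetric Chamfer distance, a standard way to compare a reconstruction against LiDAR ground truth; treating it as the paper's exact metric would be an assumption.

```python
import numpy as np
from scipy.spatial import cKDTree  # pip install scipy

def chamfer_distance(pred: np.ndarray, gt: np.ndarray) -> float:
    """Symmetric Chamfer distance between a reconstructed cloud (pred, shape (N, 3))
    and a LiDAR ground-truth cloud (gt, shape (M, 3)). Lower is better."""
    d_pred_to_gt, _ = cKDTree(gt).query(pred)   # nearest-neighbor distances pred -> gt
    d_gt_to_pred, _ = cKDTree(pred).query(gt)   # nearest-neighbor distances gt -> pred
    return float(d_pred_to_gt.mean() + d_gt_to_pred.mean())
```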