- The paper presents DG-SLAM, which integrates 3D Gaussian splatting with hybrid pose optimization to effectively handle dynamic environments.
- It employs a hybrid camera tracking strategy that combines DROID-SLAM odometry with coarse-to-fine optimization to enhance pose accuracy.
- Adaptive Gaussian point management and motion mask generation yield state-of-the-art performance, reducing Absolute Trajectory Error (ATE) across multiple dynamic-scene datasets.
DG-SLAM: Robust Dynamic Gaussian Splatting SLAM with Hybrid Pose Optimization
This paper introduces DG-SLAM, a novel visual SLAM system designed for the challenges posed by dynamic environments. Built on the integration of 3D Gaussian splatting into SLAM, DG-SLAM provides a robust framework for accurate pose estimation and high-fidelity reconstruction, outperforming prior SLAM approaches that assume a static environment.
Key Contributions
- Dynamic Scene Handling: DG-SLAM is claimed to be the first visual SLAM system built on 3D Gaussian models that explicitly targets dynamic environments. Its robustness in dynamic conditions comes from two techniques: motion mask generation and adaptive Gaussian point management.
- Hybrid Camera Tracking: The system employs a hybrid camera tracking strategy that blends DROID-SLAM odometry with a coarse-to-fine optimization protocol. This approach is pivotal for enhancing the consistency between estimated poses and reconstructed maps, facilitating improved accuracy in dynamic scenarios.
- Efficient Map Management: An adaptive strategy for adding and pruning Gaussian points keeps the generated maps clean and geometrically consistent, which is critical for managing scene complexity and computational resources in dynamic scenes.
- Motion Mask Generation: The authors generate motion masks by combining depth-warp observations with semantic priors, improving the precision of dynamic-object segmentation and contributing substantially to the system's robustness.
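To make the depth-warp idea concrete, here is a minimal NumPy sketch of one way such a motion mask could be computed: back-project the current frame's depth, transform it into the previous frame with the relative pose, and flag pixels whose projected depth disagrees with the observed depth. The function name, residual threshold, and interface are illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def depth_warp_motion_mask(depth_cur, depth_prev, K, T_prev_cur,
                           resid_thresh=0.05, semantic_mask=None):
    """Illustrative motion mask from depth-warp residuals (assumed API).

    Pixels whose warped depth differs from the observed previous depth by
    more than `resid_thresh` are flagged dynamic; an optional semantic
    prior (e.g. a person-class mask) is OR-ed in.
    """
    H, W = depth_cur.shape
    fx, fy, cx, cy = K[0, 0], K[1, 1], K[0, 2], K[1, 2]

    # Back-project current-frame pixels to homogeneous 3D camera points.
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    z = depth_cur
    pts = np.stack([(u - cx) * z / fx, (v - cy) * z / fy,
                    z, np.ones_like(z)], axis=-1)          # (H, W, 4)

    # Transform into the previous camera frame and project to pixels.
    pts_prev = pts @ T_prev_cur.T                          # (H, W, 4)
    z_proj = pts_prev[..., 2]
    safe_z = np.maximum(z_proj, 1e-6)
    u_p = np.round(fx * pts_prev[..., 0] / safe_z + cx).astype(int)
    v_p = np.round(fy * pts_prev[..., 1] / safe_z + cy).astype(int)

    # Keep only pixels with positive depth that land inside the image.
    valid = (z > 0) & (z_proj > 0) & (u_p >= 0) & (u_p < W) \
            & (v_p >= 0) & (v_p < H)
    resid = np.zeros((H, W))
    resid[valid] = np.abs(depth_prev[v_p[valid], u_p[valid]] - z_proj[valid])

    motion = valid & (resid > resid_thresh)
    if semantic_mask is not None:
        motion |= semantic_mask.astype(bool)
    return motion
```

For a static scene observed under an identity relative pose the residuals vanish and the mask stays empty; a pixel whose depth changes between frames exceeds the threshold and is flagged dynamic.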
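The adaptive point-management idea can likewise be sketched in a few lines: prune Gaussians whose opacity has decayed below a floor, and mark high-gradient survivors for densification (cloning or splitting in a full system). The thresholds and function signature below are hypothetical, chosen only to illustrate the add/prune pattern, not DG-SLAM's actual parameters.

```python
import numpy as np

def manage_gaussians(opacity, grad_norm, positions,
                     prune_opacity=0.005, densify_grad=0.0002):
    """Illustrative densify/prune step for a set of 3D Gaussians.

    opacity, grad_norm: per-Gaussian scalars; positions: (N, 3) centers.
    Returns the kept positions and indices (into the kept set) that a
    full system would clone or split to densify under-fit regions.
    """
    # Prune: drop nearly transparent Gaussians to keep the map clean.
    keep = opacity >= prune_opacity
    # Densify: large accumulated positional gradients signal regions
    # where the current Gaussians under-fit the observations.
    densify = keep & (grad_norm > densify_grad)
    return positions[keep], np.where(densify[keep])[0]
```

Running this periodically between mapping iterations keeps the point count bounded: transient or occluded Gaussians fade in opacity and are removed, while poorly reconstructed regions accumulate gradient and receive new points.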
Experimental Results
DG-SLAM is evaluated on diverse and challenging datasets, including TUM RGB-D, BONN RGB-D Dynamic, and ScanNet. On dynamic scenes it demonstrates superior accuracy and stability in pose estimation, achieving state-of-the-art Absolute Trajectory Error (ATE) compared with both classical and neural SLAM systems. It also delivers strong rendering and reconstruction quality, producing outputs free of the artifacts that dynamic objects typically introduce.
Implications and Future Directions
DG-SLAM advances the applicability of SLAM in real-world scenarios, particularly autonomous navigation and augmented reality, where dynamic environments are commonplace. The integration of a robust motion mask generation strategy addresses a crucial gap in handling non-static elements. Moreover, the explicit use of 3D Gaussians for scene representation, along with an effective point management system, ensures that the SLAM system can maintain high fidelity and real-time performance.
The authors acknowledge limitations, including the lack of large-scale loop closure and a dependence on the accuracy of semantic segmentation. Future research could explore more flexible loop-closure methods and improved dynamic-object perception to further enhance performance under varying conditions; extending DG-SLAM to outdoor settings with even more dynamic elements is another promising direction.
In conclusion, DG-SLAM represents a significant step forward in the development of dynamic SLAM systems, offering robust solutions for accurate navigation and mapping in environments previously considered challenging due to their dynamic nature.