
DDN-SLAM: Real-time Dense Dynamic Neural Implicit SLAM (2401.01545v2)

Published 3 Jan 2024 in cs.CV and cs.RO

Abstract: SLAM systems based on NeRF have demonstrated superior performance in rendering quality and scene reconstruction for static environments compared to traditional dense SLAM. However, they encounter tracking drift and mapping errors in real-world scenarios with dynamic interferences. To address these issues, we introduce DDN-SLAM, the first real-time dense dynamic neural implicit SLAM system integrating semantic features. To address dynamic tracking interferences, we propose a feature point segmentation method that combines semantic features with a mixed Gaussian distribution model. To avoid incorrect background removal, we propose a mapping strategy based on sparse point cloud sampling and background restoration. We propose a dynamic semantic loss to eliminate dynamic occlusions. Experimental results demonstrate that DDN-SLAM is capable of robustly tracking and producing high-quality reconstructions in dynamic environments, while appropriately preserving potential dynamic objects. Compared to existing neural implicit SLAM systems, the tracking results on dynamic datasets indicate an average 90% improvement in Average Trajectory Error (ATE) accuracy.


Summary

  • The paper introduces DDN-SLAM, a semantic SLAM system that integrates depth-guided static masks with optical flow and semantic information to improve tracking in dynamic environments.
  • It achieves real-time dense mapping and tracking at 20-30Hz using versatile input configurations like monocular, stereo, and RGB-D cameras.
  • Experimental results demonstrate superior accuracy, computational efficiency, and robustness compared to state-of-the-art methods, indicating strong potential for mobile platform deployment.

Overview of DDN-SLAM

DDN-SLAM is introduced as a semantic Simultaneous Localization and Mapping (SLAM) system that operates effectively in dynamic environments. Traditional neural implicit SLAM systems excel in static scenarios but struggle with dynamic interferences such as moving objects. DDN-SLAM incorporates semantic information to improve tracking and mapping quality in scenes affected by these dynamic elements.

Real-time Dense Mapping and Tracking

The system employs depth-guided static masks and multi-resolution hash encoding to swiftly fill in missing data and produce high-quality maps free of artifacts from dynamic objects. It performs robust tracking using feature points validated by optical flow and semantic information. The system also supports various input configurations, including monocular, stereo, and RGB-D cameras, while maintaining reliable operation at 20-30 Hz.
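As a rough illustration of the multi-resolution hash encoding mentioned above (popularized by Instant-NGP), the sketch below looks up features for 3D points at several grid resolutions and concatenates them. All names, table sizes, and the nearest-cell lookup (real systems trilinearly interpolate) are illustrative simplifications, not the paper's implementation.

```python
import numpy as np

# Large primes used for spatial hashing, as in Instant-NGP.
PRIMES = np.array([1, 2654435761, 805459861], dtype=np.uint64)

def hash_coords(coords, table_size):
    """Hash integer grid coordinates into a fixed-size feature table."""
    coords = coords.astype(np.uint64)
    h = np.zeros(coords.shape[:-1], dtype=np.uint64)
    for d in range(coords.shape[-1]):
        h ^= coords[..., d] * PRIMES[d]
    return (h % np.uint64(table_size)).astype(np.int64)

def encode(x, tables, base_res=16, growth=1.5):
    """Concatenate feature lookups over several grid resolutions.

    x: (N, 3) points in [0, 1]^3; tables: list of (T, F) feature tables,
    one per resolution level (sizes here are hypothetical).
    """
    feats = []
    for level, table in enumerate(tables):
        res = int(base_res * growth ** level)
        # Nearest-cell lookup for brevity; real encoders interpolate.
        cell = np.floor(x * res).astype(np.int64)
        idx = hash_coords(cell, table.shape[0])
        feats.append(table[idx])
    return np.concatenate(feats, axis=-1)
```

The concatenated multi-level features would then feed a small MLP that predicts density and color for volume rendering.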

Handling Dynamic Environments

To address the challenges posed by dynamic objects, DDN-SLAM filters and segments static and dynamic features within the scene. Rather than computing the fundamental matrix, a costly step in traditional Visual SLAM (VSLAM) methods, it identifies dynamic points through a combination of depth and optical flow anomalies. By differentiating foreground and background points with high accuracy, the system can restrict initial pose estimation and mapping to static elements.
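The static/dynamic split described above can be sketched as a simple outlier test: a feature is discarded as dynamic if its semantic class is a potential mover, or if its optical-flow or depth residual is anomalous relative to the rest of the scene. The MAD-based threshold and all parameter names here are illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def static_mask(flow_residual, depth_residual, semantic_dynamic, k=3.0):
    """Return a boolean mask of features to keep for pose estimation.

    flow_residual, depth_residual: (N,) per-feature residuals.
    semantic_dynamic: (N,) bool, True for classes that may move (e.g. people).
    A feature is dynamic if flagged semantically OR if either residual lies
    more than k robust standard deviations from the scene median.
    """
    def robust_outlier(r):
        med = np.median(r)
        mad = np.median(np.abs(r - med)) + 1e-9  # median absolute deviation
        return np.abs(r - med) > k * 1.4826 * mad  # 1.4826 * MAD ~ std dev

    dynamic = (semantic_dynamic
               | robust_outlier(flow_residual)
               | robust_outlier(depth_residual))
    return ~dynamic
```

Only the surviving static features would then be used for pose estimation and bundle adjustment.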

Experimental Validation

Extensive testing was conducted across multiple datasets, comprising both virtual and real-world scenarios, demonstrating superior tracking and mapping capabilities compared to existing state-of-the-art approaches. The system showed resilience across various dynamic and static scenes, excelling in conditions where traditional neural implicit SLAM systems and even some traditional SLAM methods falter.
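The tracking comparisons above are typically reported as Absolute Trajectory Error (ATE). As a minimal sketch of the metric, the function below computes the translation RMSE after centroid alignment of time-synchronized trajectories; a full evaluation would also solve for rotation and scale (e.g. via Umeyama alignment), which is omitted here for brevity.

```python
import numpy as np

def ate_rmse(est, gt):
    """RMSE of translational error after centroid alignment.

    est, gt: (N, 3) translation components of the estimated and
    ground-truth trajectories, assumed time-synchronized.
    """
    est_c = est - est.mean(axis=0)
    gt_c = gt - gt.mean(axis=0)
    err = np.linalg.norm(est_c - gt_c, axis=1)  # per-pose error
    return float(np.sqrt(np.mean(err ** 2)))
```

A lower ATE indicates more accurate tracking; a constant offset between trajectories cancels out under centroid alignment.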

Implementation and Efficiency

DDN-SLAM is implemented on high-performance hardware, and the experiments highlight its computational efficiency relative to its contemporaries. It shows significantly lower memory usage and higher operational speed, suggesting suitability for deployment on lighter-weight platforms. Maintaining stable operation across a wide range of scenarios, DDN-SLAM is a strong candidate for real-time applications that require accurate and robust 3D scene reconstruction.

Limitations and Future Work

While the system's potential for mobile platforms such as robots is clear, the paper notes that better depth estimation and improved spatio-temporal consistency could yield further gains. Future research might focus on capturing finer detail and enforcing global consistency in the mapping results.
