DVM-SLAM: Decentralized Visual Monocular Simultaneous Localization and Mapping for Multi-Agent Systems (2503.04126v1)

Published 6 Mar 2025 in cs.RO, cs.CV, and cs.MA

Abstract: Cooperative Simultaneous Localization and Mapping (C-SLAM) enables multiple agents to work together in mapping unknown environments while simultaneously estimating their own positions. This approach enhances robustness, scalability, and accuracy by sharing information between agents, reducing drift, and enabling collective exploration of larger areas. In this paper, we present Decentralized Visual Monocular SLAM (DVM-SLAM), the first open-source decentralized monocular C-SLAM system. By only utilizing low-cost and light-weight monocular vision sensors, our system is well suited for small robots and micro aerial vehicles (MAVs). DVM-SLAM's real-world applicability is validated on physical robots with a custom collision avoidance framework, showcasing its potential in real-time multi-agent autonomous navigation scenarios. We also demonstrate comparable accuracy to state-of-the-art centralized monocular C-SLAM systems. We open-source our code and provide supplementary material online.

Summary

The paper introduces DVM-SLAM, the first open-source decentralized monocular C-SLAM system enabling peer-to-peer communication for multi-agent systems.
DVM-SLAM employs incremental, asynchronous pose graph optimization and demonstrates performance comparable to centralized systems on standard datasets.
The decentralized approach is highly practical for dynamic environments with limited communication and serves as a foundation for future distributed robotic autonomy research.

Overview of DVM-SLAM: Decentralized Visual Monocular Simultaneous Localization and Mapping for Multi-Agent Systems

The paper introduces DVM-SLAM, a decentralized monocular Simultaneous Localization and Mapping (SLAM) system designed for multi-agent settings. Centralized SLAM systems rely on a central server, which is a single point of failure and limits scalability; DVM-SLAM avoids this by having agents communicate directly, peer to peer. The authors argue that a decentralized system can keep operating with limited or intermittent communication infrastructure, which matters for multi-robot systems such as drone swarms or small robots where network reliability cannot be guaranteed.

Key Contributions

DVM-SLAM's main contributions can be summarized in the following areas:

Decentralization and Open-Source Framework: It represents the first open-source implementation of a decentralized monocular C-SLAM system. The system is tailored for resource-constrained devices using only monocular vision, showcasing real-world adaptability without the requirement for resource-intensive sensors like LiDAR.
Incremental, Asynchronous Pose Graph Optimization: Instead of periodic optimization rounds, DVM-SLAM continuously refines pose graphs as data becomes available, minimizing necessary communication and enhancing operational robustness in diverse networking conditions.
Benchmarking and Performance Evaluation: The system demonstrates performance on par with centralized systems in key performance metrics, validated using standard datasets such as EuRoC Machine Hall and TUM-VI Rooms.

Methodological Insights

Technically, the system architecture comprises several components that work cohesively to enable distributed SLAM operations:

Agent Communication Model: Defines how agents maintain state and establish peer connections crucial for robust map sharing and collaboration without central control.
Map Merging: Uses visual bag-of-words techniques to align and merge independent maps from multiple agents, with a dynamic baseline scoring scheme for flexible map coordination.
System State Machine: Automates SLAM operations across agents, transitioning through states like unmerged and merged based on activity, ensuring that changes in agent connectivity do not compromise system stability.
Map Alignment and Realignment: Agents' maps are related by a Sim(3) similarity transformation (rotation, translation, and scale) that is refined continuously, so maps can be aligned even when the initial overlap is minimal. An additive-increase, multiplicative-decrease (AIMD) strategy controls how often realignment runs.
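
The additive-increase, multiplicative-decrease idea can be illustrated with a minimal sketch. It is not taken from the DVM-SLAM code. It assumes that the controlled quantity is the interval between realignment attempts, that the interval grows by a fixed step while the alignment stays stable, and that it is cut by a constant factor when a realignment changes the map noticeably. The class name, parameter names, and default values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class AimdInterval:
    """Hypothetical AIMD controller for the time between map realignments."""

    interval: float = 1.0       # current interval (arbitrary units, e.g. seconds)
    min_interval: float = 1.0   # lower bound after repeated decreases
    max_interval: float = 30.0  # upper bound after repeated increases
    increase: float = 1.0       # additive step used while the alignment is stable
    decrease: float = 0.5       # multiplicative factor used after a large correction

    def update(self, alignment_changed: bool) -> float:
        """Return the next interval given the outcome of the last realignment."""
        if alignment_changed:
            # Assumption: a large correction means the maps are still drifting
            # apart, so realign sooner (multiplicative decrease).
            self.interval = max(self.min_interval, self.interval * self.decrease)
        else:
            # Assumption: a stable alignment means realignment can run less
            # often (additive increase).
            self.interval = min(self.max_interval, self.interval + self.increase)
        return self.interval
```

With these defaults, three stable rounds raise the interval from 1 to 4, and a single large correction halves it to 2. The asymmetry is the point of AIMD: the controller backs off slowly but reacts quickly when the alignment turns out to be stale.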

Experimental Validation

The system achieves a root-mean-square absolute trajectory error (ATE) of 5.9 cm on the EuRoC Machine Hall dataset and 6.95 cm on the TUM-VI dataset. These results support the system's accuracy; in some scenarios its error is lower than that of centralized counterparts.
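
For reference, the root-mean-square ATE is the square root of the mean squared distance between corresponding estimated and ground-truth positions. The sketch below is a minimal illustration, not the paper's evaluation code. It assumes the two trajectories are already time-associated and expressed in the same frame and scale; for monocular SLAM this typically means a similarity alignment has been applied beforehand. The function name `ate_rmse` is hypothetical.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def ate_rmse(estimated: Sequence[Point], ground_truth: Sequence[Point]) -> float:
    """RMSE of position error between two aligned, time-associated trajectories."""
    if len(estimated) != len(ground_truth) or len(estimated) == 0:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Sum of squared Euclidean distances between corresponding positions.
    squared_error = sum(
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    )
    return math.sqrt(squared_error / len(estimated))
```

The result is in the same unit as the input positions, so trajectories given in metres yield an ATE in metres.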

Implications and Future Directions

Practically, DVM-SLAM shows how decentralized SLAM can provide robust operation in dynamic environments where communication through a central server is infeasible. This is useful in applications such as autonomous UAV navigation in remote areas or indoors, where infrastructure constraints hinder centralized approaches.

Theoretically, decentralized incremental optimization fits current trends in distributed computation and AI, and it leaves room for integrating additional sensing, for example inertial measurements as in Visual-Inertial Odometry (VIO).

Future work could incorporate inertial data for greater robustness and address monocular SLAM's inherent scale ambiguity. Learning-based methods are another direction of interest, with the aim of improving adaptability to varied terrain and extending operation to feature-poor environments.

In conclusion, DVM-SLAM represents a significant stride in advancing multi-agent SLAM systems towards real-world scalability and reliability, establishing a foundation for future research in decentralized robotic autonomy.
