- The paper presents a feature-based visual-inertial SLAM system built on Maximum-a-Posteriori (MAP) estimation, reporting accuracy two to ten times better than previous approaches.
- The system integrates multi-map management with enhanced place recognition to effectively recover from long periods of visual degradation.
- ORB-SLAM3 supports diverse sensor configurations including monocular, stereo, and RGB-D, enabling versatile applications in autonomous navigation and AR/VR.
Overview of ORB-SLAM3: An Accurate Open-Source Library for Visual, Visual-Inertial and Multi-Map SLAM
Introduction
ORB-SLAM3 aims to advance the capabilities of Simultaneous Localization and Mapping (SLAM) by integrating visual, visual-inertial, and multi-map functionalities into a robust open-source library. The paper, authored by Carlos Campos, Richard Elvira, Juan J. Gómez Rodríguez, José M. M. Montiel, and Juan D. Tardós, presents a system that supports monocular, stereo, and RGB-D cameras with both pin-hole and fisheye lens models. Its primary contributions are a tightly integrated visual-inertial SLAM system and a multiple map management system, both underpinned by robust place recognition.
Key Contributions
- Visual-Inertial SLAM: ORB-SLAM3 incorporates a novel feature-based visual-inertial SLAM system that relies entirely on Maximum-a-Posteriori (MAP) estimation, even during the IMU initialization phase. This approach gives robust real-time operation across various environments, with accuracy two to ten times better than prior methods.
- Multiple Map Management: The system introduces a multi-map representation built on an enhanced place recognition method with significantly improved recall. When visual information is degraded for a long period, ORB-SLAM3 starts a new map, which is merged with earlier maps once a previously mapped area is revisited.
- Comprehensive Configurations: ORB-SLAM3 supports monocular, stereo, and RGB-D sensor configurations with pin-hole and fisheye lens models, providing versatility across different application scenarios.
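The multi-map behavior described above can be illustrated with a toy sketch. This is not the ORB-SLAM3 implementation: the names `Map`, `MultiMap`, `on_tracking_lost`, and `observe` are hypothetical, maps are reduced to sets of place identifiers, and place recognition is replaced by a simple set-membership check purely to show the control flow (start a new map on tracking loss, merge on revisit).

```python
from dataclasses import dataclass, field


@dataclass
class Map:
    """Toy map: only the set of place identifiers observed so far."""
    places: set = field(default_factory=set)


class MultiMap:
    """Hypothetical container: one active map plus stored inactive maps."""

    def __init__(self):
        self.active = Map()
        self.inactive = []

    def on_tracking_lost(self):
        # Store the current map and start a fresh one instead of giving up.
        self.inactive.append(self.active)
        self.active = Map()

    def observe(self, place_id):
        # Record the observation in the active map.
        self.active.places.add(place_id)
        # Stand-in for place recognition (an assumption of this sketch):
        # a stored map that already contains this place counts as a revisit.
        remaining = []
        for old in self.inactive:
            if place_id in old.places:
                # Merge the revisited map into the active map.
                self.active.places |= old.places
            else:
                remaining.append(old)
        self.inactive = remaining


multi_map = MultiMap()
multi_map.observe("corridor")
multi_map.observe("lab")
multi_map.on_tracking_lost()   # e.g. a long period of visual degradation
multi_map.observe("stairs")    # a new map is being built
multi_map.observe("lab")       # revisit: the stored map is merged
```

After the last call, the active map contains all three places and no inactive maps remain. A real system would match visual descriptors and verify the match geometrically before merging; the membership test here only marks where that decision would sit.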
Experimental Results
The system's robustness and accuracy were validated empirically on the EuRoC dataset. In the reported experiments, ORB-SLAM3 outperformed the compared SLAM systems in every sensor configuration.
- Monocular SLAM: Compared to ORB-SLAM2 and DSO, ORB-SLAM3 showed significantly enhanced robustness and precision, effectively managing challenging sequences and tracking losses.
- Stereo SLAM: The stereo configurations of ORB-SLAM3 yielded accuracies up to four times better than VINS-Fusion and SVO.
- Monocular-Inertial SLAM: The monocular-inertial version surpassed MSCKF, OKVIS, and ROVIO, achieving superior robustness and precision, especially in complex sequences.
- Stereo-Inertial SLAM: This configuration achieved top-tier accuracy surpassing BASALT, effectively managing even the sequences with missing frames from the EuRoC dataset.
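Accuracy comparisons of this kind are commonly expressed as the root-mean-square absolute trajectory error (RMS ATE) between estimated and ground-truth positions. The sketch below shows that computation under simplifying assumptions: the function name `ate_rmse` is hypothetical, and both trajectories are assumed to be already time-associated and aligned to a common frame, a step that real evaluations perform first.

```python
import math


def ate_rmse(estimated, ground_truth):
    """RMS of the position errors between matched 3D positions.

    Assumes the two trajectories are already time-associated and aligned.
    """
    if len(estimated) != len(ground_truth) or not estimated:
        raise ValueError("trajectories must be non-empty and of equal length")
    # Squared Euclidean distance for each matched pair of positions.
    squared_errors = [
        sum((e - g) ** 2 for e, g in zip(est, gt))
        for est, gt in zip(estimated, ground_truth)
    ]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up positions in meters, for illustration only.
ground_truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
estimated = [(0.1, 0.0, 0.0), (1.0, 0.1, 0.0)]
error = ate_rmse(estimated, ground_truth)
```

With these made-up positions, each pose is off by 0.1 m, so the RMS error is about 0.1 m. A statement such as "four times better" then corresponds to a fourfold reduction in an error value of this kind.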
Theoretical and Practical Implications
ORB-SLAM3's contributions extend both theoretical and practical frontiers in SLAM research. Its use of all four kinds of data association (short-term, mid-term, long-term, and multi-map) addresses fundamental challenges in SLAM. MAP estimation during visual-inertial initialization gives fast and robust initialization of the inertial parameters, which eases practical deployment.
From a practical perspective, ORB-SLAM3's public release as an open-source library facilitates its adoption and further improvement by the research community. Its performance across monocular, stereo, mono-inertial, and stereo-inertial configurations provides flexibility for applications ranging from autonomous navigation to augmented and virtual reality (AR/VR).
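For reference, MAP estimation in its generic form can be written as below. The notation is chosen here for illustration and is not taken from the summary: $\mathcal{X}$ denotes the states to estimate, $\mathcal{Z}$ the measurements, $\mathbf{r}_p$ a prior residual, and $\mathbf{r}_i$ the measurement residuals with covariances $\Sigma_i$. The second line assumes a Gaussian prior and independent Gaussian measurement noise, under which maximizing the posterior becomes a nonlinear least-squares problem.

```latex
\begin{aligned}
\mathcal{X}^{*}
  &= \arg\max_{\mathcal{X}} \; p(\mathcal{X} \mid \mathcal{Z})
   = \arg\max_{\mathcal{X}} \; p(\mathcal{Z} \mid \mathcal{X})\, p(\mathcal{X}) \\
  &= \arg\min_{\mathcal{X}} \; \Big( \|\mathbf{r}_{p}(\mathcal{X})\|^{2}_{\Sigma_{p}}
     + \sum_{i} \|\mathbf{r}_{i}(\mathcal{X})\|^{2}_{\Sigma_{i}} \Big),
\qquad \|\mathbf{r}\|^{2}_{\Sigma} = \mathbf{r}^{\top} \Sigma^{-1} \mathbf{r}
\end{aligned}
```

In a visual-inertial setting, the measurement residuals would typically include visual reprojection terms and inertial terms, so that camera and IMU information are weighted by their respective uncertainties in a single optimization.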
Future Directions
Future research might explore photometric methods to enhance the system's performance in low-texture environments, addressing one of ORB-SLAM3's key limitations. Additionally, investigating hybrid techniques that combine the advantages of both feature-based and direct methods could further improve robustness and accuracy in diverse scenarios.
Conclusion
ORB-SLAM3 provides a robust, accurate, and versatile SLAM solution that integrates visual and inertial data across multiple maps. Its results in the empirical evaluations, across all tested sensor configurations, underscore its potential for broad application in both research and industry.
By combining state-of-the-art MAP estimation, robust place recognition, and comprehensive support for multiple sensor configurations, ORB-SLAM3 stands as a significant advancement in the field of visual and visual-inertial SLAM. This work not only enhances the understanding of SLAM system design but also catalyzes further research and development in creating more efficient and reliable autonomous navigation systems.