Loosely-Coupled Semi-Direct Monocular SLAM (1807.10073v3)

Published 26 Jul 2018 in cs.CV and cs.RO

Abstract: We propose a novel semi-direct approach for monocular simultaneous localization and mapping (SLAM) that combines the complementary strengths of direct and feature-based methods. The proposed pipeline loosely couples direct odometry and feature-based SLAM to perform three levels of parallel optimizations: (1) photometric bundle adjustment (BA) that jointly optimizes the local structure and motion, (2) geometric BA that refines keyframe poses and associated feature map points, and (3) pose graph optimization to achieve global map consistency in the presence of loop closures. This is achieved in real-time by limiting the feature-based operations to marginalized keyframes from the direct odometry module. Exhaustive evaluation on two benchmark datasets demonstrates that our system outperforms the state-of-the-art monocular odometry and SLAM systems in terms of overall accuracy and robustness.

Authors (2)

Seong Hun Lee
Javier Civera

Summary

Loosely-Coupled Semi-Direct Monocular SLAM: A Comprehensive Overview

The paper "Loosely-Coupled Semi-Direct Monocular SLAM" by Seong Hun Lee and Javier Civera presents a monocular Simultaneous Localization and Mapping (SLAM) system that combines the complementary strengths of direct and feature-based methods. The approach runs in real time and aims for higher accuracy and robustness than either family alone, which is relevant to applications such as autonomous vehicles and augmented reality. The system integrates Direct Sparse Odometry (DSO) and ORB-SLAM, which represent state-of-the-art direct and feature-based methods, respectively.

Methodological Innovations

The core innovation of the paper is a semi-direct SLAM pipeline with a loosely-coupled architecture: a direct module and a feature-based module run in parallel, each with its own optimization. Key aspects include:

Direct Odometry Module: Running in real time, the direct module uses DSO for fast, robust tracking via photometric bundle adjustment. It builds a sparse map from raw pixel intensities rather than from extracted features, which makes it robust in low-texture environments.
Feature-Based SLAM Module: This module operates only on keyframes marginalized by the direct odometry module. It refines their poses and reconstructs a sparse feature map that supports loop closure detection and pose graph optimization. Unlike the direct odometry, it relies on ORB features, whose descriptors enable the wide-baseline matching needed for global map consistency.
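
The hand-off between the two modules can be sketched as a producer and a consumer connected by a queue. This is a minimal illustration under stated assumptions, not the authors' implementation: the class names, the fixed window size, and the "drop the oldest keyframe" rule are all hypothetical (DSO selects keyframes to marginalize with its own heuristics), and the feature-based work is reduced to a placeholder.

```python
import queue
import threading
from dataclasses import dataclass


@dataclass
class Keyframe:
    frame_id: int  # hypothetical minimal keyframe record


class DirectOdometry:
    """Stand-in for the direct module: keeps a sliding window of keyframes."""

    def __init__(self, window_size, out_queue):
        self.window_size = window_size
        self.window = []
        self.out_queue = out_queue

    def add_keyframe(self, kf):
        self.window.append(kf)
        if len(self.window) > self.window_size:
            # Assumption: marginalize the oldest keyframe. A real direct
            # odometry picks the keyframe to drop with its own criteria.
            self.out_queue.put(self.window.pop(0))


class FeatureBasedSlam:
    """Stand-in for the feature-based module: consumes marginalized keyframes."""

    def __init__(self, in_queue):
        self.in_queue = in_queue
        self.map_keyframes = []

    def run(self):
        while True:
            kf = self.in_queue.get()
            if kf is None:  # sentinel: no more keyframes
                break
            # Placeholder for feature extraction, geometric BA, loop closing.
            self.map_keyframes.append(kf.frame_id)


def run_pipeline(num_keyframes, window_size=3):
    q = queue.Queue()
    slam = FeatureBasedSlam(q)
    worker = threading.Thread(target=slam.run)
    worker.start()
    odometry = DirectOdometry(window_size, q)
    for i in range(num_keyframes):
        odometry.add_keyframe(Keyframe(i))
    q.put(None)
    worker.join()
    active = [kf.frame_id for kf in odometry.window]
    return slam.map_keyframes, active
```

The point of the design is that the odometry loop never waits on the feature-based thread: it only pushes keyframes it has already finished with, so the slower feature-based operations stay off the tracking path.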

The design runs three levels of optimization in parallel:

Local Optimization: Direct photometric bundle adjustment jointly optimizes the local structure and motion.
Intermediate Geometric Adjustment: Feature-based geometric bundle adjustment refines the keyframe poses and the associated feature map points.
Global Map Optimization: Pose graph optimization restores global map consistency when loop closures are detected.
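
To make the difference between the first two levels concrete, the sketch below computes one photometric residual (an intensity difference) and one geometric residual (a reprojection error in pixels) for a single point. It is a simplified illustration, not DSO's or ORB-SLAM's actual cost: it assumes a pinhole camera with intrinsics `K = (fx, fy, cx, cy)`, images stored as nested lists, and it omits the affine brightness terms, pixel patterns, and robust weighting that real systems use. All function names are hypothetical.

```python
import math


def transform(R, t, p):
    """Apply a rigid transform (3x3 rotation R, translation t) to point p."""
    return tuple(sum(R[i][k] * p[k] for k in range(3)) + t[i] for i in range(3))


def project(p_cam, fx, fy, cx, cy):
    """Pinhole projection of a point in camera coordinates to pixel coordinates."""
    x, y, z = p_cam
    return (fx * x / z + cx, fy * y / z + cy)


def bilinear(img, u, v):
    """Bilinearly interpolated intensity at sub-pixel location (u, v)."""
    x0, y0 = int(math.floor(u)), int(math.floor(v))
    ax, ay = u - x0, v - y0
    return ((1 - ax) * (1 - ay) * img[y0][x0]
            + ax * (1 - ay) * img[y0][x0 + 1]
            + (1 - ax) * ay * img[y0 + 1][x0]
            + ax * ay * img[y0 + 1][x0 + 1])


def reprojection_residual(observed_uv, point_world, R, t, K):
    """Geometric residual: observed feature location minus projected map point."""
    u, v = project(transform(R, t, point_world), *K)
    return (observed_uv[0] - u, observed_uv[1] - v)


def photometric_residual(img_host, img_target, uv_host, depth, R, t, K):
    """Photometric residual: target intensity at the warped pixel minus host intensity.

    (R, t) maps host-camera coordinates to target-camera coordinates.
    """
    fx, fy, cx, cy = K
    # Back-project the host pixel to a 3D point using its depth.
    p_host = ((uv_host[0] - cx) / fx * depth, (uv_host[1] - cy) / fy * depth, depth)
    u, v = project(transform(R, t, p_host), *K)
    return bilinear(img_target, u, v) - bilinear(img_host, uv_host[0], uv_host[1])
```

The photometric residual needs no feature matching but depends on image intensities staying comparable between frames, while the reprojection residual depends on a matched feature observation. That trade-off is what the two modules split between them.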

Evaluation and Results

The authors evaluated the system on two public benchmark datasets, EuRoC MAV and TUM monoVO, comparing it against state-of-the-art monocular odometry and SLAM systems such as DSO, ORB-SLAM, and variations thereof. The evaluation metrics were absolute trajectory error and alignment error, which together capture accuracy and robustness across diverse scenarios.
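
For readers unfamiliar with absolute trajectory error, the following sketch shows one common way to compute it for monocular systems. It assumes the usual convention of first aligning the estimated trajectory to ground truth with a similarity transform (scale, rotation, translation) via the Umeyama method, because monocular SLAM cannot observe absolute scale. The summary does not state the paper's exact evaluation protocol, so the function names and this alignment choice are illustrative assumptions.

```python
import numpy as np


def align_sim3(est, gt):
    """Umeyama alignment: find s, R, t minimizing sum ||gt_i - (s R est_i + t)||^2.

    est, gt: (N, 3) arrays of corresponding positions.
    """
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    e, g = est - mu_e, gt - mu_g
    n = est.shape[0]
    cov = g.T @ e / n  # cross-covariance between target and source
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0  # guard against a reflection
    R = U @ S @ Vt
    var_e = (e ** 2).sum() / n
    s = np.trace(np.diag(D) @ S) / var_e
    t = mu_g - s * R @ mu_e
    return s, R, t


def ate_rmse(est, gt):
    """Root-mean-square position error after similarity alignment."""
    s, R, t = align_sim3(est, gt)
    aligned = s * est @ R.T + t
    err = np.linalg.norm(aligned - gt, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))
```

With this convention, an estimate that differs from ground truth only by a global scale, rotation, and translation has zero error, so the metric measures drift and distortion of the trajectory shape rather than the arbitrary monocular gauge.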

The results showed that the proposed system outperformed the compared methods in overall accuracy and robustness, exhibiting:

Robustness comparable to direct methods, combined with the accuracy of feature-based techniques.
Reduced drift and fewer keyframes, particularly evident in exploratory sequences and in sequences with loop closures.
Lower computational overhead, because feature-based operations are restricted to a sparse set of keyframes.

Practical and Theoretical Implications

The work has practical implications for building efficient SLAM systems in settings that demand a low computational footprint without sacrificing accuracy or robustness. It also reinforces the value of hybrid approaches and suggests directions for future work, such as combining additional sensing modalities or using machine learning to model dynamic environments.

Speculation on Future Developments

Future work could explore tighter coupling between the two modules, for example by adaptively weighting the contributions of direct and feature-based observations at run time. Extending the approach to multi-modal SLAM with additional sensors (e.g., depth, inertial) could yield more robust and scalable localization and mapping across diverse domains. Learned models might also be used to adapt error metrics and map representations to the demands of a specific environment.

In summary, Lee and Civera's semi-direct framework is a notable step for monocular SLAM: it combines real-time performance with high accuracy by pairing direct odometry with feature-based mapping and loop closing. The work shows that loosely coupling complementary SLAM paradigms can deliver robust operation in challenging visual conditions.
