360ORB-SLAM: A Visual SLAM System for Panoramic Images with Depth Completion Network (2401.10560v1)

Published 19 Jan 2024 in cs.CV

Abstract: Visual simultaneous localization and mapping (vSLAM) is a fundamental task in computer vision and robotics, underpinning AR/VR applications as well as visual assistance and inspection systems. However, traditional vSLAM systems are limited by the camera's narrow field of view, which leads to sparse feature distributions and a lack of dense depth information. To overcome these limitations, this paper proposes 360ORB-SLAM, a SLAM system for panoramic images that incorporates a depth completion network. The system extracts feature points from the panoramic image, uses a panoramic triangulation module to generate sparse depth information, and employs a depth completion network to obtain a dense panoramic depth map. Experimental results on a novel panoramic dataset constructed with Carla demonstrate that the proposed method achieves superior scale accuracy compared to existing monocular SLAM methods and effectively addresses the challenges of feature association and scale ambiguity. The integration of the depth completion network enhances system stability and mitigates the impact of dynamic elements on SLAM performance.


Summary

  • The paper introduces 360ORB-SLAM, a visual SLAM system that integrates panoramic triangulation with a depth completion network to significantly improve scale estimation.
  • It employs a depth completion network to convert sparse depth data into dense maps, reducing tracking failures in dynamic lighting and high-motion scenarios.
  • Experimental results in the Carla simulation environment confirm the system's enhanced pose estimation and robustness, underscoring its potential for AR/VR and autonomous driving applications.

Introduction

Simultaneous Localization and Mapping (SLAM) is a core technology in robotics and computer vision, with significant implications for AR/VR applications, autonomous driving, and robot navigation. Its primary objective is to build a map of an unknown environment while concurrently estimating the agent's location within it. A key challenge in developing robust vSLAM systems is coping with the limitations of monocular cameras, such as scale uncertainty and narrow fields of view. To address these challenges, the paper introduces 360ORB-SLAM, a SLAM system tailored to panoramic images and equipped with a depth completion network that improves scale accuracy and overall system performance.

Methodology

At the core of 360ORB-SLAM is the integration of a panoramic triangulation module with a depth completion network. The triangulation module generates sparse depth information while accounting for the characteristics of panoramic cameras, whose large field of view and non-linear distortion are poorly handled by conventional pinhole-based pipelines. The depth completion network then transforms this sparse depth information into dense depth maps. Together, these components improve stability and reduce the influence of dynamic elements, yielding a more robust and accurate SLAM system.
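This summary does not spell out the triangulation math, so the following is a minimal sketch, assuming an equirectangular projection: each pixel is back-projected to a unit ray on the viewing sphere, and a matched pair of rays from two posed frames is triangulated with the midpoint method. Axis conventions and function names here are illustrative, not taken from the 360ORB-SLAM code.

    import numpy as np

    def pixel_to_ray(u, v, width, height):
        # Equirectangular back-projection: longitude spans the image width,
        # latitude the image height (conventions assumed; they vary).
        lon = (u / width - 0.5) * 2.0 * np.pi
        lat = (0.5 - v / height) * np.pi
        return np.array([np.cos(lat) * np.sin(lon),   # x: right
                         -np.sin(lat),                # y: down
                         np.cos(lat) * np.cos(lon)])  # z: forward

    def triangulate_midpoint(c1, d1, c2, d2):
        # Rays p(t) = c + t * d, with unit directions d in the world frame.
        b = c2 - c1
        d = float(d1 @ d2)
        denom = 1.0 - d * d
        if denom < 1e-9:            # near-parallel rays: depth unobservable
            return None
        t1 = (b @ d1 - d * (b @ d2)) / denom
        t2 = (d * (b @ d1) - b @ d2) / denom
        return 0.5 * ((c1 + t1 * d1) + (c2 + t2 * d2))

Because a panoramic ray can point anywhere on the sphere, the midpoint method sidesteps the forward-facing assumption built into standard pinhole triangulation. The depth completion architecture is likewise not described in this summary; purely as a shape-level illustration, a sparse-to-dense network could look like the following hypothetical encoder-decoder, which takes the RGB panorama plus the sparse triangulated depth and regresses a dense depth map.

    import torch
    import torch.nn as nn

    class SparseToDense(nn.Module):
        # Toy stand-in for a depth completion network: 3-channel RGB plus a
        # 1-channel sparse depth map in, dense depth out. The paper's actual
        # architecture is not specified here.
        def __init__(self):
            super().__init__()
            self.encoder = nn.Sequential(
                nn.Conv2d(4, 32, 3, stride=2, padding=1), nn.ReLU(True),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(True))
            self.decoder = nn.Sequential(
                nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(True),
                nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.ReLU(True))

        def forward(self, rgb, sparse_depth):
            # Final ReLU keeps the predicted depth non-negative.
            return self.decoder(self.encoder(torch.cat([rgb, sparse_depth], 1)))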

Experimental Results

The system was evaluated on a purpose-built panoramic dataset rendered in the Carla simulation environment. 360ORB-SLAM demonstrated superior scale accuracy and showed promise in resolving issues common to monocular systems, such as tracking failures under rapid motion or changing lighting conditions. The depth completion network played a pivotal role in this result: the dense depth maps it generates significantly reduced scale drift and improved pose estimation accuracy. The quantitative results underscored the system's robustness, with performance maintained across sequences and environmental conditions.
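The summary does not name the evaluation metrics used. A common way to quantify scale accuracy and pose error for monocular-style systems is to align the estimated trajectory to ground truth with a least-squares similarity transform (Umeyama alignment), read off the recovered scale factor, and compute the absolute trajectory error (ATE) on the aligned poses. A minimal sketch, assuming (N, 3) arrays of corresponding camera positions:

    import numpy as np

    def umeyama_align(est, gt):
        # Least-squares similarity transform (s, R, t) mapping est onto gt
        # (Umeyama, 1991). A recovered s near 1 means little scale error.
        mu_e, mu_g = est.mean(0), gt.mean(0)
        e, g = est - mu_e, gt - mu_g
        cov = g.T @ e / len(est)                 # 3x3 cross-covariance
        U, S, Vt = np.linalg.svd(cov)
        D = np.eye(3)
        if np.linalg.det(U) * np.linalg.det(Vt) < 0:
            D[2, 2] = -1.0                       # guard against reflections
        R = U @ D @ Vt
        var_e = (e ** 2).sum() / len(est)        # variance of the estimate
        s = (S * np.diag(D)).sum() / var_e       # trace(diag(S) @ D) / var
        t = mu_g - s * R @ mu_e
        return s, R, t

    def ate_rmse(est, gt):
        # Root-mean-square absolute trajectory error after alignment.
        s, R, t = umeyama_align(est, gt)
        aligned = (s * (R @ est.T)).T + t
        return np.sqrt(((aligned - gt) ** 2).sum(1).mean())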

Implications and Future Work

The introduction of 360ORB-SLAM signals a step forward in overcoming the constraints of traditional monocular vSLAM systems. The system's performance in feature detection, scale accuracy, and environmental mapping highlights its potential for a range of applications, from enhancing AR/VR experiences to bolstering the reliability of autonomous vehicles. While the current focus is on the application of the system within intelligent driving scenarios, the methodologies and results discussed in this paper lay the groundwork for future advancements and optimizations that can be extended to other domains and real-world implementations.