- The paper presents a UAV-assisted VSLAM system integrating ORB-SLAM3 with RGB-D cameras and markers to enable real-time, semantic indoor mapping.
- The method reduces pose estimation errors by approximately 6.99% in RMSE and 12.21% in mean error compared to the baseline ORB-SLAM3 configuration.
- The approach generates hierarchical 3D scene graphs that enhance structural understanding and support autonomous navigation in complex, GPS-denied settings.
UAV-assisted Visual SLAM Generating Reconstructed 3D Scene Graphs in GPS-denied Environments
This paper presents an approach to implementing Visual Simultaneous Localization and Mapping (VSLAM) on UAVs for accurate and informative mapping in environments where GPS signals are unavailable. The research addresses the challenge of navigating and mapping indoor spaces without GPS by equipping Unmanned Aerial Vehicles (UAVs) with RGB-D cameras and the necessary onboard computation, and by integrating them into a comprehensive VSLAM framework.
Key Components and Contributions
- Integration of VSLAM into UAVs: The work describes the integration of a VSLAM framework, which extends ORB-SLAM3, with aerial robotic systems to perform real-time localization and mapping in GPS-denied environments. The system uses robot-centric RGB-D sensor data together with markers attached to structural elements, supporting both pose estimation and semantic mapping.
- 3D Scene Graph Generation: The system goes beyond traditional map creation by generating hierarchical 3D scene graphs that incorporate semantic information, specifically room layouts marked with fiducial markers, which enrich the metadata associated with structural components such as walls and doorways. This yields a high-level understanding of the environment that supports downstream tasks such as navigation and inspection (a sketch of such a hierarchy follows this list).
- Marker-Based Enhancement: By employing fiducial markers, the VSLAM system mitigates the feature extraction and pose estimation difficulties posed by homogeneous surfaces such as plaster walls and reflective floors. The markers also improve the fidelity of structural and object-level representations in the reconstructed map (a marker-based pose estimation sketch follows this list).
- Scene Understanding and Optimization: The paper outlines improvements in the system's capability not only to detect and map structural entities such as corridors and rooms, but also to perform on-the-fly optimization using multi-layered scene graphs. This provides an extra layer of situational awareness, which is crucial for autonomous task execution in complex and dynamic environments.
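The paper does not include implementation code; the following is a minimal sketch, assuming ArUco-style fiducial markers, OpenCV 4.7 or newer, and placeholder values for marker size and camera intrinsics, of how markers in a camera frame could be detected and turned into camera-frame pose measurements.

```python
# Minimal sketch (not the authors' code): marker-aided pose estimation from a
# single camera image, assuming ArUco fiducials and OpenCV >= 4.7.
# Marker size, dictionary choice, and intrinsics are placeholder assumptions.
import cv2
import numpy as np

MARKER_SIZE = 0.15  # marker side length in metres (assumed)

# 3D corner coordinates of a marker in its own frame (z = 0 plane),
# ordered as required by cv2.SOLVEPNP_IPPE_SQUARE.
_OBJ_PTS = np.array([
    [-MARKER_SIZE / 2,  MARKER_SIZE / 2, 0],
    [ MARKER_SIZE / 2,  MARKER_SIZE / 2, 0],
    [ MARKER_SIZE / 2, -MARKER_SIZE / 2, 0],
    [-MARKER_SIZE / 2, -MARKER_SIZE / 2, 0],
], dtype=np.float32)

def detect_marker_poses(image_bgr, camera_matrix, dist_coeffs):
    """Detect fiducial markers and return {marker_id: (rvec, tvec)} in the camera frame."""
    dictionary = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    detector = cv2.aruco.ArucoDetector(dictionary, cv2.aruco.DetectorParameters())
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    corners, ids, _rejected = detector.detectMarkers(gray)

    poses = {}
    if ids is None:
        return poses
    for marker_corners, marker_id in zip(corners, ids.flatten()):
        # PnP between the marker's known square geometry and its detected image corners
        ok, rvec, tvec = cv2.solvePnP(
            _OBJ_PTS, marker_corners.reshape(-1, 2).astype(np.float32),
            camera_matrix, dist_coeffs, flags=cv2.SOLVEPNP_IPPE_SQUARE)
        if ok:
            poses[int(marker_id)] = (rvec, tvec)
    return poses
```

Each resulting marker pose can then be passed to the SLAM back end as a landmark constraint, which is the role the markers play in the system described above.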
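Similarly, the paper does not publish its scene graph data model; the sketch below illustrates, under assumed class and field names, how a hierarchical scene graph of rooms, structural elements, and attached markers might be organized.

```python
# Minimal sketch (assumed structure, not the paper's data model) of a
# hierarchical 3D scene graph: rooms contain structural elements, and
# structural elements carry the fiducial markers observed on them.
from dataclasses import dataclass, field
from typing import List
import numpy as np

@dataclass
class Marker:
    marker_id: int
    pose_world: np.ndarray          # 4x4 homogeneous transform (assumed representation)

@dataclass
class StructuralElement:            # e.g. a wall or doorway
    label: str
    plane: np.ndarray               # plane parameters [a, b, c, d]
    markers: List[Marker] = field(default_factory=list)

@dataclass
class Room:
    name: str
    elements: List[StructuralElement] = field(default_factory=list)

@dataclass
class SceneGraph:
    rooms: List[Room] = field(default_factory=list)

    def add_observation(self, room_name: str, element: StructuralElement):
        """Attach a newly mapped element to its room, creating the room on first sight."""
        room = next((r for r in self.rooms if r.name == room_name), None)
        if room is None:
            room = Room(room_name)
            self.rooms.append(room)
        room.elements.append(element)
```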
The system was evaluated through extensive experimentation in a variety of indoor setups, including environments with nested rooms and different structural layouts, providing a thorough assessment of its robustness and adaptability. Notably, the system was tested in a controlled drone-testing arena equipped with a motion capture system that provides ground-truth data for real-time evaluation. The integration of fiducial markers improved RMSE by approximately 6.99% and mean error by 12.21% over baseline ORB-SLAM3 while ensuring reliable mapping in the absence of GPS data; a sketch of how such error metrics are typically computed follows below.
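For context, the reported RMSE and mean error figures correspond to standard translation error metrics over a trajectory. The sketch below shows one common way to compute them, assuming the estimated and ground-truth trajectories are already time-aligned Nx3 position arrays (e.g. from the motion capture system); the function names are illustrative, not taken from the paper.

```python
# Minimal sketch of trajectory translation error metrics (RMSE and mean error).
import numpy as np

def translation_errors(estimated: np.ndarray, ground_truth: np.ndarray):
    """Return (rmse, mean_error) of per-pose Euclidean translation error."""
    errors = np.linalg.norm(estimated - ground_truth, axis=1)
    rmse = float(np.sqrt(np.mean(errors ** 2)))
    mean_error = float(np.mean(errors))
    return rmse, mean_error

def relative_improvement(baseline_error: float, proposed_error: float) -> float:
    """Percentage reduction of the proposed system's error relative to the baseline."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error
```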
Implications and Future Directions
The implications of this research extend to several domains, such as autonomous navigation of drones in urban canyons, search and rescue operations in GPS-denied environments, and complex industrial inspections where traditional SLAM systems face limitations. The adaptability of the system also suggests that future research could explore full UAV autonomy through deep integration with path planning and dynamic obstacle avoidance. Improvements in marker-based recognition further open avenues for hybrid sensor fusion, diversifying the sensing modalities available for robust decision-making under uncertainty.
Overall, the paper contributes a comprehensive UAV-assisted VSLAM solution that pushes the envelope in semantic 3D mapping and the autonomous capabilities of aerial robotics under challenging navigation conditions.