
Combining Local and Global Perception for Autonomous Navigation on Nano-UAVs (2403.11661v1)

Published 18 Mar 2024 in cs.RO, cs.SY, and eess.SY

Abstract: A critical challenge in deploying unmanned aerial vehicles (UAVs) for autonomous tasks is their ability to navigate in an unknown environment. This paper introduces a novel vision-depth fusion approach for autonomous navigation on nano-UAVs. We combine the visual-based PULP-Dronet convolutional neural network for semantic information extraction, i.e., serving as the global perception, with 8x8px depth maps for close-proximity maneuvers, i.e., the local perception. When tested in-field, our integration strategy highlights the complementary strengths of both visual and depth sensory information. We achieve a 100% success rate over 15 flights in a complex navigation scenario, encompassing straight pathways, static obstacle avoidance, and 90° turns.

Summary

  • The paper presents a novel fused approach that integrates CNN-based global perception with ToF sensor depth mapping, achieving a 100% success rate in navigation tests.
  • The methodology combines semantic analysis with real-time obstacle detection to overcome the constraints of individual sensing pipelines.
  • Experimental results in corridors and sharp turns confirm that the fusion strategy significantly outperforms standalone solutions in autonomous navigation.

Autonomous Navigation on Nano-UAVs: A Fusion of Depth and Vision Sensory Inputs

Introduction

The domain of autonomous nano-sized Unmanned Aerial Vehicles (UAVs) encompasses a variety of potential applications, ranging from inspection tasks in hazardous environments to inventory management within warehouse settings. These vehicles bring unique advantages, notably their ability to operate safely in proximity to humans and to access confined spaces. A significant challenge in their deployment lies in autonomous navigation, particularly in unknown environments. State-of-the-Art (SoA) systems typically rely on resource-intensive global and local planning techniques, making them unsuitable for the constrained computing capabilities of nano-UAVs; the focus has therefore been on simpler, computationally feasible solutions. This paper introduces a novel approach that combines semantic information from a convolutional neural network (CNN) with depth maps for enhanced navigation, handling straight pathways, obstacle avoidance, and abrupt turns with unprecedented success.

System Design

The proposed system integrates two perception pipelines on a nano-UAV platform to facilitate autonomous navigation. The global perception pipeline leverages PULP-Dronet, a CNN designed for visual-based navigation, to extract semantic cues from the environment. It outputs a steering angle and a collision probability, which are translated into control commands for the UAV. In parallel, the local perception pipeline utilizes an 8x8 pixel Time-of-Flight (ToF) sensor to generate depth maps indicative of immediate obstacles. This pipeline's outputs guide close-proximity maneuvers by identifying obstacle-free areas and directing the UAV accordingly.
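To make the local pipeline more concrete, the following is a minimal sketch of how an 8x8 depth map could be reduced to an obstacle flag and a steering hint. The threshold, the choice of central columns, and the function name are illustrative assumptions, not the paper's onboard implementation.

```python
import numpy as np

# Hypothetical local-perception step: flag close obstacles in an 8x8 depth map
# (distances in millimetres) and steer toward the most open horizontal region.
OBSTACLE_THRESHOLD_MM = 600.0   # assumed safety distance, not from the paper


def local_perception(depth_map: np.ndarray):
    """Return (obstacle_detected, steer_hint in [-1, 1]) from an 8x8 depth map."""
    assert depth_map.shape == (8, 8)

    # Collapse rows: the closest reading in each column dominates the decision.
    column_min = depth_map.min(axis=0)                      # shape (8,)

    # An obstacle is "in the way" if any central column is closer than the threshold.
    obstacle_detected = bool(column_min[2:6].min() < OBSTACLE_THRESHOLD_MM)

    # Steer toward the centroid of obstacle-free columns (index 0..7 -> [-1, +1]).
    free_cols = np.flatnonzero(column_min >= OBSTACLE_THRESHOLD_MM)
    steer_hint = float((free_cols.mean() - 3.5) / 3.5) if free_cols.size else 0.0
    return obstacle_detected, steer_hint


if __name__ == "__main__":
    demo = np.full((8, 8), 2000.0)
    demo[:, 0:3] = 400.0                      # wall-like obstacle on the left
    print(local_perception(demo))             # -> (True, ~0.43): veer right
```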

A fusion mechanism combines these two pipelines, exploiting the complementary strengths of depth-based and vision-based sensory inputs. The decision-making process incorporates outputs from both pipelines, adjusting the UAV's steering and speed to navigate complex environments successfully.
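One plausible arbitration scheme is sketched below: the collision probability from the global pipeline throttles forward speed, while a nearby obstacle reported by the local pipeline lets the depth-derived steering hint override the CNN's steering angle. The constants, names, and the blending rule are assumptions for illustration, not the paper's exact fusion policy.

```python
from dataclasses import dataclass


@dataclass
class ControlCommand:
    forward_speed: float   # m/s
    yaw_rate: float        # normalised, -1 (left) .. +1 (right)


MAX_SPEED = 0.5            # assumed cruise speed in m/s
COLLISION_STOP_PROB = 0.7  # assumed collision-probability cut-off


def fuse(cnn_steer: float, collision_prob: float,
         obstacle_detected: bool, depth_steer: float) -> ControlCommand:
    """Blend global (CNN) and local (depth) cues into a single command."""
    # Global perception sets the cruise speed: slow down as collision risk rises.
    speed = MAX_SPEED * max(0.0, 1.0 - collision_prob / COLLISION_STOP_PROB)

    if obstacle_detected:
        # Local perception dominates close to obstacles: steer into free space.
        return ControlCommand(forward_speed=min(speed, 0.2), yaw_rate=depth_steer)

    # Otherwise follow the CNN's semantic steering (e.g. corridor following, turns).
    return ControlCommand(forward_speed=speed, yaw_rate=cnn_steer)


# Example: high collision probability and a close obstacle -> slow, depth-guided turn.
print(fuse(cnn_steer=0.1, collision_prob=0.8, obstacle_detected=True, depth_steer=0.43))
```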

Experimental Setup and Results

The experimental evaluation, conducted in an office corridor, covered straight pathways, static obstacles, and 90-degree turns. A comparative analysis of the global perception pipeline, the local perception pipeline, and the fused pipeline demonstrated the superiority of the fused approach. The global perception pipeline, while effective in obstacle-free environments, failed in scenarios involving obstacles. In contrast, the local perception pipeline managed obstacle avoidance but struggled with decisions requiring semantic understanding, such as turning at corridor ends. The fused pipeline achieved a 100% success rate across all scenarios, effectively navigating straight corridors, avoiding obstacles, and executing turns, confirming the advantages of leveraging both depth and visual cues.

Conclusion and Future Implications

The research presents a leap forward in autonomy for nano-UAVs, underlining the feasibility of integrating global and local perception to navigate complex environments. The success of the fused pipeline not only enhances nano-UAVs' functional capabilities but also opens avenues for further exploration, such as dynamic obstacle avoidance and adaptation to diverse operational contexts. Future advancements may focus on refining the fusion mechanism, exploring scalable and more efficient deep learning models for real-time semantic processing, and expanding sensory inputs to enrich environmental understanding. This work demonstrates that even with constrained computational resources, it is possible to achieve advanced autonomous navigation by intelligently combining different sensory modalities.

Acknowledgments

Appreciation extends to D. Palossi and D. Christodoulou for their contributions and H. Müller for hardware support.
