Deep Learning for UAV-based Object Detection and Tracking: Expert Analysis
The paper "Deep Learning for UAV-based Object Detection and Tracking: A Survey" presents a comprehensive review exploring the intersection of unmanned aerial vehicles (UAVs) with computer vision (CV) and remote sensing (RS). UAVs, due to their versatile data acquisition capabilities, have gained significant attention in diverse applications including environmental monitoring, urban planning, and disaster management. This paper catalogues recent developments in deep learning approaches tailored for object detection and tracking in UAV data, structured around three core topics: object detection from images, video object detection, and multiple object tracking.
Object Detection from Images
The paper segments deep learning methods addressing UAV-borne image object detection into several sub-challenges. These include issues stemming from scale diversity, small object detection, directional diversity, and real-time processing requirements.
- Scale Diversity: Multi-scale feature extraction techniques commonly combine multi-scale feature maps with dilated or deformable convolution kernels to handle varying object sizes efficiently. Methods such as RRNet and HRDNet demonstrate robust detection across large variations in spatial scale.
- Small Object Detection: Detecting small objects requires enhanced feature learning, as in architectures such as RRNet and FS-SSD. Approaches like perceptual GANs transform the representations of small objects to resemble those of larger counterparts, improving detection accuracy.
- Directional Diversity: Rotation-invariant network designs address the arbitrary orientations of objects in UAV images. By adopting rotation augmentation and advanced pooling layers (e.g., Fisher discriminative pooling), models handle orientation variation effectively.
- Real-time Processing: Lightweight models such as SlimYOLOv3 prune traditional architectures to meet real-time demands without drastically sacrificing accuracy, a crucial capability for UAV applications that require immediate, on-board data interpretation.
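The multi-scale idea above can be illustrated with dilated (atrous) convolution, which enlarges the receptive field without adding parameters. Below is a minimal 1-D sketch in pure Python, purely illustrative; real detectors apply 2-D dilated convolutions over feature maps:

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """1-D dilated (atrous) convolution with 'valid' padding.

    With dilation d, a kernel of size k covers a receptive field of
    d * (k - 1) + 1 input samples while reusing the same k weights.
    """
    k = len(kernel)
    span = dilation * (k - 1) + 1          # effective receptive field
    out = []
    for start in range(len(signal) - span + 1):
        taps = [signal[start + i * dilation] for i in range(k)]
        out.append(sum(w * x for w, x in zip(kernel, taps)))
    return out

x = [1, 2, 3, 4, 5, 6, 7, 8]
print(dilated_conv1d(x, [1, 1, 1], dilation=1))  # receptive field 3: [6, 9, 12, 15, 18, 21]
print(dilated_conv1d(x, [1, 1, 1], dilation=2))  # receptive field 5, same 3 weights: [9, 12, 15, 18]
```

Stacking layers with increasing dilation rates lets a detector aggregate context at several scales from the same feature map, which is one way the scale-diversity problem is addressed.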
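The pruning strategy behind lightweight detectors such as SlimYOLOv3 can be sketched as ranking channels by the magnitude of their batch-normalization scale factors (gamma) and keeping only the strongest. This is a hedged simplification: the actual method also applies sparsity training to push unimportant gammas toward zero before pruning.

```python
def prune_channels(bn_scales, keep_ratio=0.5):
    """Select channel indices to keep, ranked by |gamma| (BN scale factor).

    Channels whose scale factor is near zero contribute little to the
    layer output and can be removed to shrink the model.
    """
    n_keep = max(1, int(len(bn_scales) * keep_ratio))
    ranked = sorted(range(len(bn_scales)),
                    key=lambda i: abs(bn_scales[i]), reverse=True)
    return sorted(ranked[:n_keep])

gammas = [0.91, 0.02, 0.45, 0.003, 0.77, 0.10]
print(prune_channels(gammas, keep_ratio=0.5))  # keeps the 3 largest-|gamma| channels: [0, 2, 4]
```

Removing the weak channels (and the corresponding filters in adjacent layers) shrinks both model size and inference cost, which is what makes such detectors viable on embedded UAV hardware.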
Video Object Detection
Video object detection refines per-frame detection results using temporal information spread across frames, facilitated by techniques such as optical flow and memory networks.
- Optical Flow-based Networks exploit temporal motion information to warp and aggregate features from adjacent frames, improving accuracy and robustness against motion blur and dynamic environmental conditions, as exemplified by the FGFA and TDFA methods.
- Memory Networks such as ConvLSTM combine frame-level features with long-term memory traces to refine object detection over varying spatial and temporal contexts, with notable contributions from SCNN and advanced LSTM variants.
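The flow-guided aggregation idea can be sketched as averaging per-frame feature vectors with weights given by their similarity to the current frame. This is a deliberately simplified, pure-Python illustration: FGFA additionally warps neighboring features along the estimated optical flow before aggregating them.

```python
import math

def cosine(a, b):
    """Cosine similarity between two feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def aggregate_features(current, neighbors):
    """Similarity-weighted average of the current frame's feature vector
    with (already motion-compensated) neighboring-frame features.

    Frames that look more like the current frame get higher weight, so a
    blurred or occluded frame is stabilized by its cleaner neighbors.
    """
    feats = [current] + neighbors
    weights = [cosine(current, f) for f in feats]  # current gets weight 1.0
    total = sum(weights)
    return [sum(w * f[i] for w, f in zip(weights, feats)) / total
            for i in range(len(current))]

# A degraded current-frame feature is pulled toward its sharper neighbors:
agg = aggregate_features([1.0, 0.0], [[0.9, 0.1], [0.8, 0.2]])
```

The aggregated vector lies between the current and neighboring features, which is the mechanism that suppresses per-frame noise such as motion blur.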
Multiple Object Tracking
Multiple Object Tracking (MOT) from UAV-based video employs several families of methods: Tracking-by-Detection (TBD), Single Object Tracking (SOT)-assisted MOT, and memory networks.
- Tracking-by-Detection methods like SORT and Deep SORT couple pre-trained detectors with efficient data association strategies to track objects across frames, though performance can degrade under rapid object or camera motion.
- SOT-assisted MOT approaches use per-object trackers to support the tracking mechanism, especially under fast motion. Siamese networks provide learned similarity measures that aid effective data association.
- Memory Networks incorporate learned historical trajectories of objects using LSTM architectures, linking observations across long temporal spans to maintain identities in complex scenarios.
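The tracking-by-detection pipeline above can be sketched through its core step, data association: matching current detections to existing tracks by bounding-box IoU. This is a simplified greedy stand-in for SORT, which additionally predicts track positions with a Kalman motion model and solves the assignment with the Hungarian algorithm.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter) if inter else 0.0

def associate(tracks, detections, iou_threshold=0.3):
    """Greedy IoU matching: returns (track_idx, det_idx) pairs.

    Unmatched detections would start new tracks; unmatched tracks become
    candidates for termination (both omitted here for brevity).
    """
    pairs = sorted(((iou(t, d), ti, di)
                    for ti, t in enumerate(tracks)
                    for di, d in enumerate(detections)), reverse=True)
    matches, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < iou_threshold:
            break
        if ti not in used_t and di not in used_d:
            matches.append((ti, di))
            used_t.add(ti)
            used_d.add(di)
    return matches

tracks = [(0, 0, 10, 10), (20, 20, 30, 30)]
dets = [(21, 21, 31, 31), (1, 1, 11, 11)]
print(sorted(associate(tracks, dets)))  # [(0, 1), (1, 0)]
```

Deep SORT extends this association cost with an appearance term (cosine distance between learned embeddings), which is what improves robustness to the identity switches that pure IoU matching suffers under fast UAV motion.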
Implications and Future Insights
This survey consolidates the foundational understanding and showcases the prominent strides in UAV-based object detection and tracking through deep learning methods. Practical implications include improved accuracy across diverse environmental settings and object scales, enabling effective real-world applications from agriculture to security surveillance.
Looking forward, ongoing developments may encompass multi-modal sensor integration on UAV platforms, combining infrared, multispectral, and hyperspectral information to further improve detection and tracking performance across variable contexts and climates. The paper also anticipates advances in computational efficiency, advocating more optimized deep learning models geared towards embedded and mobile platforms. Furthermore, addressing challenges such as complex non-cooperative scenarios and varying climate conditions will remain pivotal in advancing UAV data-driven methodologies. The survey offers an intellectual cornerstone for researchers, guiding future avenues in UAV-based deep learning techniques.