ReDet: A Rotation-equivariant Detector for Aerial Object Detection (2103.07733v1)

Published 13 Mar 2021 in cs.CV

Abstract: Recently, object detection in aerial images has gained much attention in computer vision. Different from objects in natural images, aerial objects are often distributed with arbitrary orientation. Therefore, the detector requires more parameters to encode the orientation information, which are often highly redundant and inefficient. Moreover, as ordinary CNNs do not explicitly model the orientation variation, large amounts of rotation augmented data is needed to train an accurate object detector. In this paper, we propose a Rotation-equivariant Detector (ReDet) to address these issues, which explicitly encodes rotation equivariance and rotation invariance. More precisely, we incorporate rotation-equivariant networks into the detector to extract rotation-equivariant features, which can accurately predict the orientation and lead to a huge reduction of model size. Based on the rotation-equivariant features, we also present Rotation-invariant RoI Align (RiRoI Align), which adaptively extracts rotation-invariant features from equivariant features according to the orientation of RoI. Extensive experiments on several challenging aerial image datasets, DOTA-v1.0, DOTA-v1.5 and HRSC2016, show that our method can achieve state-of-the-art performance on the task of aerial object detection. Compared with previous best results, our ReDet gains 1.2, 3.5 and 2.6 mAP on DOTA-v1.0, DOTA-v1.5 and HRSC2016 respectively while reducing the number of parameters by 60% (313 Mb vs. 121 Mb). The code is available at: https://github.com/csuhan/ReDet.

Authors (4)

Jiaming Han
Jian Ding
Nan Xue
Gui-Song Xia

Summary

The paper "ReDet: A Rotation-equivariant Detector for Aerial Object Detection" addresses object detection in aerial images, where objects appear at arbitrary orientations. Because ordinary Convolutional Neural Networks (CNNs) do not model orientation explicitly, a detector built on them needs extra parameters to encode orientation, which are often redundant, as well as large amounts of rotation-augmented training data. The paper tackles both problems with a Rotation-equivariant Detector (ReDet).

Key Contributions

The primary contributions of this work include:

Rotation-equivariant Networks Integration: The authors incorporate rotation-equivariant networks into the detector. The extracted features are rotation-equivariant: rotating the input transforms the features in a predictable way, which supports accurate orientation prediction. This is realized with group convolutions that share weights across orientations, which also reduces model size.
Rotation-invariant RoI Align (RiRoI Align): RiRoI Align adaptively extracts rotation-invariant features from the equivariant features according to the orientation of each Region of Interest (RoI). It aligns features in the spatial dimension and also in the orientation channels, so the features extracted for an object do not depend on its orientation.
Detailed Experimental Validation: The authors evaluate ReDet on DOTA-v1.0, DOTA-v1.5, and HRSC2016. It achieves state-of-the-art performance, improving mean Average Precision (mAP) over the previous best results by 1.2, 3.5, and 2.6 points on the three datasets respectively.
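
The first two contributions can be illustrated with a toy example. The sketch below is not the authors' implementation: it assumes the simplest case, the group of four 90-degree rotations, a single-channel input, and one shared filter, and the helper names (correlate_valid, lift_c4, align_orientation) are hypothetical. It checks that rotating the input rotates each feature map and cyclically shifts the orientation channels (equivariance), and that undoing both the spatial rotation and the channel shift recovers the original features (the idea behind aligning orientation channels as well as spatial positions).

```python
import numpy as np

def correlate_valid(x, w):
    """'Valid' cross-correlation of a square image x with a square filter w."""
    k = w.shape[0]
    m = x.shape[0] - k + 1
    out = np.zeros((m, m))
    for a in range(k):
        for b in range(k):
            out += w[a, b] * x[a:a + m, b:b + m]
    return out

def lift_c4(x, w):
    """Apply ONE stored filter in 4 rotated copies.

    Returns an array of shape (4, m, m); axis 0 indexes orientation.
    """
    return np.stack([correlate_valid(x, np.rot90(w, r)) for r in range(4)])

def align_orientation(feat, r):
    """Undo r quarter-turns: rotate the maps back and shift channels back."""
    return np.roll(np.rot90(feat, k=-r, axes=(1, 2)), -r, axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((9, 9))   # toy "image"
w = rng.standard_normal((3, 3))   # the single shared filter

y = lift_c4(x, w)
for r in range(4):
    y_rot = lift_c4(np.rot90(x, r), w)
    # Equivariance: maps rotate spatially, orientation channels shift by r.
    expected = np.rot90(np.roll(y, r, axis=0), k=r, axes=(1, 2))
    assert np.allclose(y_rot, expected)
    # Invariance after alignment: the original features are recovered.
    assert np.allclose(align_orientation(y_rot, r), y)

print("equivariance and alignment checks passed")
```

Only one 3x3 filter is stored here even though four orientation responses are produced, which is the weight-sharing effect in miniature. The paper's detector handles arbitrary RoI angles, not just quarter-turns, which this sketch does not attempt to reproduce.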

Numerical and Performance Insights

ReDet reduces the number of parameters by 60% relative to the previous best results (313 Mb vs. 121 Mb). The reduction comes from sharing weights across orientations instead of spending separate, largely redundant parameters on each one, so the accuracy gains come with a much smaller model.
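
As a quick arithmetic check on the reported sizes, the snippet below computes the relative reduction from the two figures quoted in the abstract. The variable names are illustrative only, and the units are taken as stated in the source.

```python
baseline_size = 313  # previous best, as reported in the abstract
redet_size = 121     # ReDet, as reported in the abstract

reduction = (baseline_size - redet_size) / baseline_size
shrink_factor = baseline_size / redet_size

print(f"relative reduction: {reduction:.1%}")   # relative reduction: 61.3%
print(f"shrink factor: {shrink_factor:.2f}x")   # shrink factor: 2.59x
```

The exact figure is slightly above the rounded 60% quoted in the abstract.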

The experimental comparison on the DOTA datasets shows that ReDet improves detection accuracy while offering a better trade-off between model size and accuracy than existing baselines. Its explicit handling of rotation makes it well suited to objects at varying orientations, which are common in aerial images.

Implications and Future Directions

The implications of this research are multi-faceted:

Practical Impact: In real-world applications where aerial images are prevalent, such as surveillance and geospatial analysis, ReDet provides a more efficient and accurate method for detecting arbitrarily oriented objects.
Theoretical Advancements: The introduction of rotation-equivariant architectures within the detection pipeline paves the way for further exploration in equivariant representations, potentially extending beyond object detection to other vision tasks.

Future work could explore deeper networks or alternative group convolution configurations to improve performance further. Combining rotation equivariance with equivariance to other transformations, such as scaling, could also yield models that handle a broader range of transformations.

This paper makes a substantial contribution to object detection, specifically in settings characterized by arbitrary object orientations, and lays the groundwork for further work on rotation equivariance in deep learning.

GitHub

GitHub - csuhan/ReDet: Official code of the paper "ReDet: A Rotation-Equivariant Detector for Aerial Object Detection" (CVPR 2021) (404 stars)