Oriented R-CNN for Object Detection (2108.05699v1)

Published 12 Aug 2021 in cs.CV

Abstract: Current state-of-the-art two-stage detectors generate oriented proposals through time-consuming schemes. This diminishes the detectors' speed, thereby becoming the computational bottleneck in advanced oriented object detection systems. This work proposes an effective and simple oriented object detection framework, termed Oriented R-CNN, which is a general two-stage oriented detector with promising accuracy and efficiency. To be specific, in the first stage, we propose an oriented Region Proposal Network (oriented RPN) that directly generates high-quality oriented proposals in a nearly cost-free manner. The second stage is oriented R-CNN head for refining oriented Regions of Interest (oriented RoIs) and recognizing them. Without tricks, oriented R-CNN with ResNet50 achieves state-of-the-art detection accuracy on two commonly-used datasets for oriented object detection including DOTA (75.87% mAP) and HRSC2016 (96.50% mAP), while having a speed of 15.1 FPS with the image size of 1024×1024 on a single RTX 2080Ti. We hope our work could inspire rethinking the design of oriented detectors and serve as a baseline for oriented object detection. Code is available at https://github.com/jbwang1997/OBBDetection.

Authors (5)

Xingxing Xie
Gong Cheng
Jiabao Wang
Xiwen Yao
Junwei Han

Summary

Overview of "Oriented R-CNN for Object Detection"

The paper "Oriented R-CNN for Object Detection" introduces a two-stage framework that improves both the efficiency and the accuracy of oriented object detection, a task that matters wherever arbitrarily oriented objects must be localized precisely. It targets the computational bottleneck of existing state-of-the-art two-stage detectors, which stems mainly from their complex and time-consuming schemes for generating oriented proposals.

Key Contributions

The primary contribution of this work is the Oriented R-CNN framework, which includes:

Oriented Region Proposal Network (Oriented RPN): This is a lightweight network designed to directly generate high-quality oriented proposals in a computationally efficient manner. It achieves this through a novel parameterization approach known as the midpoint offset representation, which uses six parameters to effectively represent arbitrary-oriented bounding boxes.
Oriented R-CNN Head: The second stage refines the oriented proposals (oriented RoIs) and classifies them. It uses rotated RoI alignment to extract features aligned with each proposal's orientation, so that the extracted features are rotation-invariant.
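
The six-parameter midpoint offset representation can be illustrated with a minimal decoding sketch. It assumes the parameters are (x, y, w, h, Δα, Δβ), where (x, y, w, h) describe the external axis-aligned box and Δα, Δβ are the offsets of the top and right vertices from the midpoints of the top and right sides. This summary does not give the vertex formulas, so the ones below are an assumption about the representation, and the function name `decode_midpoint_offset` is hypothetical rather than taken from the released code.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def decode_midpoint_offset(
    x: float, y: float, w: float, h: float, d_alpha: float, d_beta: float
) -> List[Point]:
    """Decode a six-parameter midpoint-offset box into four vertices.

    Assumed convention (not confirmed by the summary text):
    (x, y) is the centre of the external axis-aligned box of size (w, h);
    d_alpha shifts the top/bottom vertices horizontally away from the side
    midpoints, and d_beta shifts the right/left vertices vertically.
    Image coordinates are used, so y grows downwards.
    """
    x, y, w, h = float(x), float(y), float(w), float(h)
    d_alpha, d_beta = float(d_alpha), float(d_beta)
    v1 = (x + d_alpha, y - h / 2)  # vertex on the top side
    v2 = (x + w / 2, y + d_beta)   # vertex on the right side
    v3 = (x - d_alpha, y + h / 2)  # vertex on the bottom side (mirror of v1)
    v4 = (x - w / 2, y - d_beta)   # vertex on the left side (mirror of v2)
    return [v1, v2, v3, v4]


# With d_alpha = w/2 and d_beta = h/2 the vertices coincide with the
# corners of the external box, i.e. an ordinary horizontal box.
corners = decode_midpoint_offset(10, 20, 8, 4, 4, 2)
```

Because v3 and v4 mirror v1 and v2 through the centre, the decoded shape is always a parallelogram; turning it into a true oriented rectangle requires a further adjustment step that this sketch does not cover.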

Numerical Results

The paper reports the following results:

On the DOTA dataset, it achieves a mean average precision (mAP) of 75.87% using a ResNet-50 backbone and 76.28% with ResNet-101, outperforming various contemporary methods.
On the HRSC2016 dataset, it achieves 96.50% mAP with ResNet-50, which the paper reports as state-of-the-art accuracy.
The framework runs at 15.1 FPS on 1024×1024 images on a single RTX 2080Ti.

Strong Claims

The paper claims that Oriented R-CNN makes proposal generation far more efficient without compromising detection accuracy. The oriented RPN is lightweight: it has roughly 1/3000 of the parameters of RoI Transformer+ and 1/15 of the parameters of the rotated RPN.

Implications

Practically, the framework allows faster object detection in real-world scenarios where object orientation matters, such as aerial imagery and autonomous navigation. Theoretically, it suggests ways to reduce computational bottlenecks in multi-stage detectors, which may influence further research into efficient network design for complex tasks.

Future Prospects

For future research, the work is poised to serve as a baseline for evaluating oriented object detectors, encouraging exploration into improved representation schemes and more efficient network architectures. There is potential for integration with more advanced backbone networks and adaptation for other modalities beyond image data.

In summary, the Oriented R-CNN framework introduces valuable advancements in achieving efficient and precise oriented object detection. Its robust and efficient approach presents practical benefits, stimulating further investigation into optimizing object detection frameworks. The publicly available code also facilitates continued development and adaptation by the broader research community.
