Oriented Objects as pairs of Middle Lines (1912.10694v3)

Published 23 Dec 2019 in cs.CV

Abstract: The detection of oriented objects arises frequently in natural scene text detection as well as in object detection in aerial images. Traditional detectors for oriented objects commonly rotate anchors on the basis of the RCNN frameworks, which multiplies the number of anchors across a variety of angles; coupled with the rotated NMS algorithm, this greatly increases the computational complexity of these models. In this paper, we propose a novel model named Oriented Objects Detection Network (O^2-DNet) to detect oriented objects by predicting a pair of middle lines inside each target. O^2-DNet is a one-stage, anchor-free and NMS-free model. The target line segments of our model are defined as the two corresponding middle lines of the original rotated bounding box annotations, which can be derived directly from those annotations without additional manual tagging. Experiments show that our O^2-DNet achieves excellent performance on the ICDAR 2015 and DOTA datasets. It is noteworthy that the objects in COCO can be regarded as a special form of oriented objects with an angle of 90 degrees. O^2-DNet can still achieve competitive results on these general natural object detection datasets.

Citations (208)

Summary

The paper introduces O²-DNet, a novel anchor-free and NMS-free network that detects oriented objects by predicting pairs of middle lines within each object, offering improved efficiency over traditional methods.
O²-DNet utilizes a unique Middle Lines Detection approach with a specialized Line Loss function and a drift region concept to accurately represent object orientation and position.
Experimental results show O²-DNet achieves competitive or superior performance on standard datasets like ICDAR 2015, DOTA, and COCO, demonstrating its versatility for various oriented object detection tasks.

An Overview of "Oriented Objects as Pairs of Middle Lines"

The paper presents an approach to detecting oriented objects more efficiently in fields such as natural scene text detection and aerial image analysis. Traditionally, detecting oriented objects relies on rotating anchors within region-based CNN frameworks, which raises computational cost because anchors must be replicated across multiple angles. The paper introduces the Oriented Objects Detection Network (O²-DNet), a one-stage, anchor-free, and NMS-free model. O²-DNet predicts the orientation of objects by detecting a pair of middle lines within each object, avoiding the cost of rotated anchors and rotated NMS.
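The abstract notes that the middle-line targets can be derived directly from existing rotated bounding box annotations. A minimal sketch of that conversion follows, assuming each annotation is given as four corner points listed in order around the box. The function name `box_to_middle_lines` and the corner-list format are illustrative assumptions, not taken from the paper.

```python
def box_to_middle_lines(corners):
    """Derive the two middle lines of a rotated box (illustrative helper).

    corners: four (x, y) vertices listed in order around the box.
    Returns the box center and two line segments, each joining the
    midpoints of a pair of opposite edges.
    """
    # Midpoint of edge i, which joins corner i and corner (i + 1) % 4.
    mids = []
    for i in range(4):
        (x0, y0), (x1, y1) = corners[i], corners[(i + 1) % 4]
        mids.append(((x0 + x1) / 2.0, (y0 + y1) / 2.0))

    # The middle lines cross at the box center (the mean of the corners).
    center = (
        sum(x for x, _ in corners) / 4.0,
        sum(y for _, y in corners) / 4.0,
    )
    line_a = (mids[0], mids[2])  # joins midpoints of edges 0 and 2
    line_b = (mids[1], mids[3])  # joins midpoints of edges 1 and 3
    return center, line_a, line_b


# An axis-aligned 4 x 2 box: one middle line is vertical, the other horizontal.
center, line_a, line_b = box_to_middle_lines([(0, 0), (4, 0), (4, 2), (0, 2)])
print(center, line_a, line_b)
```

For an axis-aligned box the two middle lines come out vertical and horizontal. This matches the abstract's remark that COCO-style boxes are a special case of oriented objects.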

Key Methodology

Middle Lines Detection:
- O²-DNet identifies oriented objects through the prediction of two intersecting middle lines rather than conventional bounding boxes. These lines intersect at the center of the object, with the endpoints regressed relative to this center point, allowing for an efficient detection system that simplifies the model's structure significantly.
Anchor-Free Strategy:
- The model's anchor-free nature permits it to bypass the computational overhead associated with setting multiple anchor boxes at different angles, a typical necessity in conventional oriented object detection frameworks.
Line Loss Function:
- A Line Loss function is introduced to control the position, orientation, and the relationship between the detected middle lines. This function includes parallelism and vertical constraints that ensure the accurate representation of oriented objects by the model.
Drift Region for Intersection Points:
- To stabilize detection, the concept of a drift region is used, which allows minor deviations in the predicted intersection point without impacting the final bounding box accuracy. This mechanism mitigates the potential for detection errors caused by slight inaccuracies in intersection point prediction.
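
The points above can be illustrated with a small sketch. It rests on several assumptions: the network outputs a point and four endpoint offsets relative to that point, ordered so that offsets 0 and 2 belong to one middle line and offsets 1 and 3 to the other. `decode_box` then rebuilds the box corners from the absolute endpoints. `perpendicularity_penalty` is one possible way to express a constraint between the two lines, reading the "vertical" constraint as perpendicularity. Both functions are hypothetical and do not reproduce the paper's Line Loss.

```python
import math


def decode_box(point, offsets):
    """Rebuild four box corners from a point and four endpoint offsets."""
    # Absolute endpoints of the two middle lines.
    ends = [(point[0] + dx, point[1] + dy) for dx, dy in offsets]
    # The box center is recovered from the endpoints, not from `point`.
    cx = sum(x for x, _ in ends) / 4.0
    cy = sum(y for _, y in ends) / 4.0
    # Half-vectors along each middle line.
    ux, uy = (ends[0][0] - ends[2][0]) / 2.0, (ends[0][1] - ends[2][1]) / 2.0
    vx, vy = (ends[1][0] - ends[3][0]) / 2.0, (ends[1][1] - ends[3][1]) / 2.0
    return [
        (cx + ux - vx, cy + uy - vy),
        (cx + ux + vx, cy + uy + vy),
        (cx - ux + vx, cy - uy + vy),
        (cx - ux - vx, cy - uy - vy),
    ]


def perpendicularity_penalty(offsets):
    """|cos| of the angle between the two middle lines; 0 when perpendicular."""
    ux, uy = offsets[0][0] - offsets[2][0], offsets[0][1] - offsets[2][1]
    vx, vy = offsets[1][0] - offsets[3][0], offsets[1][1] - offsets[3][1]
    norm = math.hypot(ux, uy) * math.hypot(vx, vy)
    return abs(ux * vx + uy * vy) / (norm + 1e-9)


# The same box decoded from the true center and from a slightly shifted point.
ends = [(2.0, 0.0), (4.0, 1.0), (2.0, 2.0), (0.0, 1.0)]
for p in [(2.0, 1.0), (2.5, 1.25)]:
    offs = [(x - p[0], y - p[1]) for x, y in ends]
    print(decode_box(p, offs), perpendicularity_penalty(offs))
```

Under these assumptions the decoded box depends only on the absolute endpoints. If the predicted point shifts slightly and the offsets compensate, the corners are unchanged, which is one way to read why a drift region can be tolerated.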

Experimental Evaluation

The paper evaluates O²-DNet on three datasets: ICDAR 2015 for text detection, DOTA for aerial images, and COCO for general object detection. The reported results show competitive performance on each:

- ICDAR 2015: It outperforms several traditional methods with an F1 score of 82.97%, showing its effectiveness at detecting arbitrarily oriented text in natural scenes.
- DOTA: The model achieves a mean Average Precision (mAP) of 71.04% on a benchmark whose objects vary widely in orientation, size, and density.
- COCO: Even on a dataset with axis-aligned bounding boxes, O²-DNet maintains an mAP of 41.3%, indicating that the representation carries over to general object detection.
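
For reference, the F1 score quoted for ICDAR 2015 is the harmonic mean of precision $P$ and recall $R$. This is the standard definition; the separate precision and recall values are not given in this summary.

```latex
F_1 = \frac{2 \, P \, R}{P + R}
```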

Implications and Future Research

The O²-DNet architecture offers a unified framework for handling varying object orientations without rotated anchors or NMS. Representing objects as pairs of middle lines could inspire further investigation into alternative representations for object detection tasks.

In practical terms, O²-DNet could streamline computation in satellite imaging and autonomous navigation systems, where real-time processing and adaptability to diverse object orientations are paramount. Future work could explore architectural refinements that raise precision or cut computation time without sacrificing accuracy.

This work lays a promising foundation for ongoing research into efficient object detection technologies, especially within domains requiring high precision and low computational demand like remote sensing and real-time scene analysis.