
Learning a Rotation Invariant Detector with Rotatable Bounding Box (1711.09405v1)

Published 26 Nov 2017 in cs.CV

Abstract: Detection of arbitrarily rotated objects is a challenging task due to the difficulties of locating the multi-angle objects and separating them effectively from the background. The existing methods are not robust to angle variations of the objects because of the use of traditional bounding box, which is a rotation variant structure for locating rotated objects. In this article, a new detection method is proposed which applies the newly defined rotatable bounding box (RBox). The proposed detector (DRBox) can effectively handle the situation where the orientation angles of the objects are arbitrary. The training of DRBox forces the detection networks to learn the correct orientation angle of the objects, so that the rotation invariant property can be achieved. DRBox is tested to detect vehicles, ships and airplanes on satellite images, compared with Faster R-CNN and SSD, which are chosen as the benchmark of the traditional bounding box based methods. The results show that DRBox performs much better than traditional bounding box based methods do on the given tasks, and is more robust against rotation of input image and target objects. Besides, results show that DRBox correctly outputs the orientation angles of the objects, which is very useful for locating multi-angle objects efficiently. The code and models are available at https://github.com/liulei01/DRBox.

Citations (183)

Summary

  • The paper introduces DRBox, a novel method that replaces traditional bounding boxes with rotatable ones to handle arbitrary object orientations.
  • It integrates angle estimation into a multi-layer convolutional network, achieving improved detection accuracy and speed.
  • Experiments on satellite imagery show that DRBox outperforms traditional detectors on BEP, AP, and mAP while running at 70-80 fps.

Learning a Rotation Invariant Detector with Rotatable Bounding Box

The detection of arbitrarily rotated objects poses a significant challenge in computer vision, particularly in applications involving aerial and satellite imagery. Traditional bounding boxes (BBoxes) are rotation variant: the box that encloses an object changes as the object rotates, which makes it harder to locate multi-angle objects and to separate them from the background. The paper introduces a method for object detection using a rotatable bounding box (RBox), aiming to improve robustness and accuracy in detecting rotated objects.

Approach

The proposed method, Detector using RBox (DRBox), employs RBoxes instead of BBoxes to effectively encapsulate and recognize objects with arbitrary orientations. An RBox is parameterized by five variables: the center coordinates, width, height, and orientation angle, allowing it to more accurately reflect the true geometry of the detected objects.
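The five-parameter description can be made concrete with a small sketch. The `RBox` type, the field names, and the angle convention (radians, counter-clockwise, measured from the x-axis) are assumptions chosen for illustration and are not taken from the DRBox code. The sketch converts an RBox to its four corner points and, for comparison, computes the axis-aligned box that would be needed to enclose it.

```python
import math
from typing import List, NamedTuple, Tuple


class RBox(NamedTuple):
    """Hypothetical RBox container: centre, size, and orientation."""
    cx: float
    cy: float
    w: float
    h: float
    angle: float  # radians, counter-clockwise (assumed convention)


def rbox_corners(box: RBox) -> List[Tuple[float, float]]:
    """Return the four corners of the RBox, rotated about its centre."""
    c, s = math.cos(box.angle), math.sin(box.angle)
    half_w, half_h = box.w / 2.0, box.h / 2.0
    offsets = ((-half_w, -half_h), (half_w, -half_h),
               (half_w, half_h), (-half_w, half_h))
    # Rotate each corner offset, then translate to the box centre.
    return [(box.cx + dx * c - dy * s, box.cy + dx * s + dy * c)
            for dx, dy in offsets]


def enclosing_bbox(box: RBox) -> Tuple[float, float, float, float]:
    """Axis-aligned box (x_min, y_min, x_max, y_max) around the RBox."""
    xs, ys = zip(*rbox_corners(box))
    return min(xs), min(ys), max(xs), max(ys)
```

For example, a 4 by 2 RBox rotated by 45 degrees has area 8, while its enclosing axis-aligned box has area 18, so more than half of the traditional box would be background.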

The core innovation lies in integrating rotation information within the bounding structure itself, thereby providing enhanced discrimination between objects and background pixels, and improving the separation of densely packed objects. The DRBox methodology extends SSD training procedures to accommodate angle estimation, reinforcing the accurate learning of objects’ orientations.

Methodology

The DRBox network is a multi-layer convolutional network in which multi-angle prior RBoxes play a critical role. The priors are rotated to a series of angles so that, together, they cover the range of possible object orientations. During training, ArIoU (angle-related IoU) is introduced to assess both the overlap and the angular alignment between predicted RBoxes and ground truths, leading to more precise angle estimation.
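A minimal sketch of how multi-angle priors and an angle-related IoU could fit together is given below. The summary does not state the ArIoU formula, so the form used here is an assumption: box `a` is turned to the angle of box `b`, the ordinary IoU of the two now-parallel boxes is computed, and the result is multiplied by the cosine of the original angle difference. The `RBox` type, `multi_angle_priors`, `best_prior`, and the default of six angles are likewise hypothetical names and values chosen for illustration.

```python
import math
from typing import List, NamedTuple


class RBox(NamedTuple):
    """Hypothetical RBox container: centre, size, and orientation."""
    cx: float
    cy: float
    w: float
    h: float
    angle: float  # radians (assumed convention)


def ar_iou(a: RBox, b: RBox) -> float:
    """Assumed ArIoU: IoU(a turned to b's angle, b) * cos(angle difference).

    Widths and heights are assumed to be positive.
    """
    # Express a's centre in b's local frame, where b is axis-aligned
    # and centred at the origin.
    c, s = math.cos(b.angle), math.sin(b.angle)
    dx, dy = a.cx - b.cx, a.cy - b.cy
    ux, uy = dx * c + dy * s, -dx * s + dy * c
    # With a turned to b's angle, both boxes are axis-aligned in this frame.
    ix = max(0.0, min(ux + a.w / 2, b.w / 2) - max(ux - a.w / 2, -b.w / 2))
    iy = max(0.0, min(uy + a.h / 2, b.h / 2) - max(uy - a.h / 2, -b.h / 2))
    inter = ix * iy
    union = a.w * a.h + b.w * b.h - inter
    return inter / union * math.cos(a.angle - b.angle)


def multi_angle_priors(cx: float, cy: float, w: float, h: float,
                       n_angles: int = 6) -> List[RBox]:
    """Prior RBoxes at one location, evenly spaced over [0, pi)."""
    return [RBox(cx, cy, w, h, k * math.pi / n_angles)
            for k in range(n_angles)]


def best_prior(priors: List[RBox], gt: RBox) -> int:
    """Index of the prior with the highest ArIoU against a ground truth."""
    return max(range(len(priors)), key=lambda i: ar_iou(priors[i], gt))
```

Because the cosine factor shrinks as the angles diverge, a prior that overlaps a ground truth well but points in the wrong direction scores low. This is one way an SSD-style matching step could push the network toward learning orientation rather than only position and size.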

In the experiments, separate DRBox networks are trained for ship, vehicle, and airplane detection, and each shows substantial improvements over traditional BBox methods. The network is also efficient, with a processing speed of 70-80 fps on an NVIDIA GTX 1080Ti and Intel Core i7.

Numerical Results

The paper presents extensive testing and results comparing DRBox with Faster R-CNN and SSD, using a custom-built satellite image dataset. DRBox displays superior performance, consistently achieving higher BEP, AP, and mAP in object detection across various scenarios. The method shows remarkable robustness to rotation, as quantified by low STD_AP and STD_AS values, indicating consistent detections regardless of image or object orientation.
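Of the metrics named above, BEP (break-even point) is the least commonly reported, so a short sketch may help. It is the point on the precision-recall curve where precision equals recall. The function below is an illustrative implementation and does not reproduce the paper's evaluation protocol; the function name, its arguments, and the example values are hypothetical. It assumes each detection has already been marked as a true or false positive.

```python
from typing import Sequence


def break_even_point(scores: Sequence[float],
                     is_true_positive: Sequence[bool],
                     num_ground_truth: int) -> float:
    """Approximate the precision = recall point of a ranked detection list."""
    # Rank detections from most to least confident.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    tp = 0
    best_gap, best_value = float("inf"), 0.0
    for rank, i in enumerate(order, start=1):
        tp += int(is_true_positive[i])
        precision = tp / rank
        recall = tp / num_ground_truth
        gap = abs(precision - recall)
        # Keep the cutoff where precision and recall are closest.
        if gap < best_gap:
            best_gap, best_value = gap, (precision + recall) / 2
    return best_value


# Made-up example: five detections, four ground-truth objects.
bep = break_even_point([0.9, 0.8, 0.7, 0.6, 0.5],
                       [True, True, False, True, False],
                       num_ground_truth=4)
```

In this made-up example, keeping the top four detections gives three true positives, so precision and recall are both 0.75 and the function returns 0.75.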

Implications

DRBox is most relevant to remote sensing applications where object orientation varies greatly, such as urban surveillance, maritime monitoring, and infrastructure analysis. Treating orientation as a learned output offers a promising direction for further research and may benefit other computer vision tasks that are sensitive to orientation.

Future Work

Future developments could explore the application of RBox within proposal-based frameworks like R-FCN or Faster R-CNN, potentially extending the benefits of rotation invariant detection across a broader range of detection models. Moreover, the detailed training approach could be generalized to enhance robustness across different environmental conditions and object classes.

In conclusion, DRBox offers a substantial step forward in detecting rotated objects, providing both practical application benefits and a foundation for future explorations in rotation-aware computer vision methodologies.