Rethinking Rotated Object Detection with Gaussian Wasserstein Distance Loss (2101.11952v4)

Published 28 Jan 2021 in cs.CV and cs.AI

Abstract: Boundary discontinuity and its inconsistency with the final detection metric have been the bottleneck for rotating detection regression loss design. In this paper, we propose a novel regression loss based on Gaussian Wasserstein distance as a fundamental approach to solve the problem. Specifically, the rotated bounding box is converted to a 2-D Gaussian distribution, which makes it possible to approximate the non-differentiable rotational IoU induced loss by the Gaussian Wasserstein distance (GWD), which can be learned efficiently by gradient back-propagation. GWD can still be informative for learning even when there is no overlap between two rotating bounding boxes, which is often the case for small object detection. Thanks to its three unique properties, GWD can also elegantly solve the boundary discontinuity and square-like problem regardless of how the bounding box is defined. Experiments on five datasets using different detectors show the effectiveness of our approach. Codes are available at https://github.com/yangxue0827/RotationDetection and https://github.com/open-mmlab/mmrotate.

Authors (6)
  1. Xue Yang (141 papers)
  2. Junchi Yan (241 papers)
  3. Qi Ming (8 papers)
  4. Wentao Wang (47 papers)
  5. Xiaopeng Zhang (100 papers)
  6. Qi Tian (314 papers)
Citations (343)

Summary

The paper introduces a regression loss for rotated (arbitrary-oriented) object detection based on the Gaussian Wasserstein distance (GWD). The loss targets three problems that the authors attribute to conventional angle-regression losses: boundary discontinuity, inconsistency between the training loss and the final detection metric, and the square-like problem.

Key Contributions

  1. Gaussian Wasserstein Distance for Rotated Bounding Boxes: The authors convert each rotated bounding box into a two-dimensional Gaussian distribution. The distance between a predicted and a ground-truth box can then be measured with the Gaussian Wasserstein distance, which gives a differentiable loss that tracks the detection metric more closely than parameter-wise regression. The same construction addresses boundary discontinuity and the square-like problem.
  2. Theoretical and Empirical Justification: The paper gives the mathematical formulation of the loss and analyzes its properties. The proposed loss is more consistent with the Intersection over Union (IoU) metric while remaining differentiable, so it can be trained efficiently by back-propagation.
  3. Experimental Validation: Experiments on five datasets, including aerial and scene text images, and with different detectors support the approach. The reported results show gains over prior methods, particularly for arbitrary angles and small objects.
  4. Unified Solution for Rotated Object Detection: The loss is designed to work regardless of how the bounding box is defined, so different parameterizations of the same box are treated equivalently. This makes the method less sensitive to the box convention used by a given dataset or detector.
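
The box-to-Gaussian conversion and the distance described in the contributions above can be sketched in NumPy. This is a minimal sketch under stated assumptions, not the authors' implementation (the repositories linked in the abstract contain that): the function names are hypothetical, a box is assumed to be (x, y, w, h, theta) with theta in radians, the covariance is built from a square root of the form R diag(w/2, h/2) R^T, and the final squashing 1 - 1/(tau + log(1 + d^2)) is one assumed choice of normalization.

```python
import numpy as np

def box_to_gaussian(x, y, w, h, theta):
    """Map a rotated box to (mean, covariance). Assumes theta is in radians."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    # Square root of the covariance: half-extents along the rotated axes.
    sqrt_cov = rot @ np.diag([w / 2.0, h / 2.0]) @ rot.T
    return np.array([x, y], dtype=float), sqrt_cov @ sqrt_cov

def sqrtm_psd(a):
    """Square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def gwd_squared(box1, box2):
    """Squared 2-Wasserstein distance between the Gaussians of two boxes."""
    m1, s1 = box_to_gaussian(*box1)
    m2, s2 = box_to_gaussian(*box2)
    s1_half = sqrtm_psd(s1)
    cross = sqrtm_psd(s1_half @ s2 @ s1_half)
    return float(np.sum((m1 - m2) ** 2) + np.trace(s1 + s2 - 2.0 * cross))

def gwd_loss(box1, box2, tau=1.0):
    """Assumed normalization of the distance into a bounded loss (tau >= 1)."""
    d2 = max(gwd_squared(box1, box2), 0.0)  # guard against tiny negative round-off
    return 1.0 - 1.0 / (tau + np.log1p(d2))

box = (0.0, 0.0, 4.0, 2.0, 0.3)
swapped = (0.0, 0.0, 2.0, 4.0, 0.3 - np.pi / 2)   # same rectangle, other convention
far_away = (100.0, 0.0, 4.0, 2.0, 0.3)            # no overlap with `box`

print("same rectangle, different parameters:", gwd_squared(box, swapped))
print("non-overlapping boxes:", gwd_squared(box, far_away))
print("loss for non-overlapping boxes:", gwd_loss(box, far_away))
```

In this sketch, exchanging width and height while shifting the angle by π/2 yields the same Gaussian, so the distance is zero up to round-off, and two boxes with no overlap still produce a distance that grows with the offset between their centers.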

Numerical Results and Implications

Experiments on DOTA, HRSC2016, and the ICDAR benchmarks show improved detection metrics. For instance, the proposed method improves mAP by up to 3.20% on the DOTA dataset compared with the traditional smooth L1 loss, which supports the practical value of the GWD-based loss.
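
One way to see why a parameter-wise loss such as smooth L1 can behave poorly near the angular boundary is to compare it with GWD on two nearly identical rectangles written in different parameterizations. The following is an illustrative sketch only: the helper names are hypothetical, smooth L1 is applied here to the raw (x, y, w, h, theta) values rather than to the encoded offsets a real detector would regress, and the example boxes are made up for the demonstration.

```python
import numpy as np

def smooth_l1(pred, target, beta=1.0):
    """Smooth L1 summed over raw box parameters (illustrative, not encoded offsets)."""
    d = np.abs(np.asarray(pred, dtype=float) - np.asarray(target, dtype=float))
    return float(np.sum(np.where(d < beta, 0.5 * d ** 2 / beta, d - 0.5 * beta)))

def to_gaussian(x, y, w, h, theta):
    """Rotated box -> (mean, covariance), with theta in radians."""
    c, s = np.cos(theta), np.sin(theta)
    rot = np.array([[c, -s], [s, c]])
    root = rot @ np.diag([w / 2.0, h / 2.0]) @ rot.T
    return np.array([x, y], dtype=float), root @ root

def psd_sqrt(a):
    """Square root of a symmetric positive semi-definite matrix."""
    vals, vecs = np.linalg.eigh(a)
    return (vecs * np.sqrt(np.clip(vals, 0.0, None))) @ vecs.T

def gwd_sq(b1, b2):
    """Squared Gaussian Wasserstein distance between two rotated boxes."""
    m1, s1 = to_gaussian(*b1)
    m2, s2 = to_gaussian(*b2)
    half = psd_sqrt(s1)
    cross = psd_sqrt(half @ s2 @ half)
    return float(np.sum((m1 - m2) ** 2) + np.trace(s1 + s2 - 2.0 * cross))

# Two descriptions of almost the same rectangle: the prediction swaps width and
# height and sits near the other end of the angle range.
target = (0.0, 0.0, 4.0, 2.0, 0.0)
pred = (0.0, 0.0, 2.0, 4.0, -np.pi / 2 + 0.02)

l1_value = smooth_l1(pred, target)
gwd_value = gwd_sq(pred, target)
print("smooth L1 on raw parameters:", l1_value)
print("squared GWD:", gwd_value)
```

Under these assumptions the smooth L1 value is large even though the two rectangles almost coincide, while the squared GWD stays close to zero, which is the behavior the summary refers to as resolving boundary discontinuity.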

Implications for Future Research and Development

The introduction of GWD in rotated object detection paves the way for more refined regression models capable of handling complex object orientations and scales. The paper suggests that this methodology could be extended to other areas where object orientation is crucial, including more advanced 3D object detection and segmentation tasks.

The work invites further research into the applicability of Gaussian-based representations in various computer vision tasks. Future investigations might explore hybrid approaches that combine GWD with other advanced loss functions to further improve detection accuracy and computational efficiency.

In summary, the paper contributes a theoretically motivated and empirically validated loss for rotated object detection. By addressing boundary discontinuity, metric inconsistency, and the square-like problem in a single formulation, the GWD-based approach offers an alternative to direct angle regression that does not depend on how the bounding box is defined.