
Oriented Object Detection with Transformer (2106.03146v1)

Published 6 Jun 2021 in cs.CV

Abstract: Object detection with Transformers (DETR) has achieved competitive performance over traditional detectors such as Faster R-CNN. However, the potential of DETR remains largely unexplored for the more challenging task of arbitrarily oriented object detection. We provide the first attempt and implement Oriented Object DEtection with TRansformer ($\bf O2DETR$) based on an end-to-end network. The contributions of $\rm O2DETR$ include: 1) we provide a new insight into oriented object detection by applying a Transformer to directly and efficiently localize objects without the tedious rotated-anchor design of conventional detectors; 2) we design a simple but highly efficient encoder for the Transformer by replacing the attention mechanism with depthwise separable convolution, which significantly reduces the memory and computational cost of using multi-scale features in the original Transformer; 3) our $\rm O2DETR$ can serve as a new baseline in the field of oriented object detection, achieving up to a 3.85 mAP improvement over Faster R-CNN and RetinaNet. We simply fine-tune the head mounted on $\rm O2DETR$ in a cascaded architecture and achieve performance competitive with the state of the art on the DOTA dataset.
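The abstract's second contribution rests on a cost argument: self-attention over a multi-scale feature map scales quadratically in the number of tokens, while a depthwise separable convolution scales linearly. The sketch below illustrates that gap with rough multiply-add counts; the token count, channel width, and kernel size are illustrative assumptions, not figures from the paper.

```python
# Rough cost comparison motivating O2DETR's convolutional encoder.
# All constants below are illustrative, not the paper's exact settings.

def attention_cost(n_tokens: int, dim: int) -> int:
    # Self-attention: the QK^T product and the attention-weighted sum
    # over V are each roughly N^2 * d multiply-adds.
    return 2 * n_tokens * n_tokens * dim

def depthwise_separable_cost(n_tokens: int, dim: int, kernel: int = 3) -> int:
    # Depthwise pass: `kernel` multiply-adds per channel per position,
    # followed by a pointwise (1x1) channel-mixing pass of d^2 per position.
    return n_tokens * dim * kernel + n_tokens * dim * dim

# Multi-scale features make N large; assume a 100x100 feature map
# with 256 channels as a stand-in for one DETR encoder level.
n, d = 100 * 100, 256
ratio = attention_cost(n, d) // depthwise_separable_cost(n, d)
print(f"attention is ~{ratio}x more multiply-adds than depthwise separable conv")
```

The ratio grows linearly with the token count, which is why replacing attention matters most on the large, fine-grained feature maps that oriented detection on aerial imagery (e.g. DOTA) relies on.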

Authors (9)
  1. Teli Ma (22 papers)
  2. Mingyuan Mao (6 papers)
  3. Honghui Zheng (2 papers)
  4. Peng Gao (402 papers)
  5. Xiaodi Wang (15 papers)
  6. Shumin Han (18 papers)
  7. Errui Ding (156 papers)
  8. Baochang Zhang (113 papers)
  9. David Doermann (54 papers)
Citations (38)
