DQ-DETR: DETR with Dynamic Query for Tiny Object Detection (2404.03507v6)
Abstract: Although previous DETR-like methods perform well in generic object detection, tiny object detection remains challenging for them because the positional information of their object queries is not tailored to tiny objects, whose scale is far smaller than that of general objects. In addition, DETR-like methods use a fixed number of queries, which makes them ill-suited to aerial datasets that contain only tiny objects and whose number of instances varies widely from image to image. We therefore present a simple yet effective model, DQ-DETR, which consists of three components: a categorical counting module, counting-guided feature enhancement, and dynamic query selection. DQ-DETR uses the prediction and density maps from the categorical counting module to dynamically adjust the number of object queries and to improve the positional information of the queries. DQ-DETR outperforms previous CNN-based and DETR-like methods, achieving a state-of-the-art mAP of 30.2% on the AI-TOD-V2 dataset, which consists mostly of tiny objects. Our code will be available at https://github.com/hoiliu-0801/DQ-DETR.
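The idea of letting a counting prediction set the number of object queries, and letting a density map suggest where to place them, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not state the count bins, the query budgets, or how positions are derived, so `COUNT_UPPER_BOUNDS`, `QUERIES_PER_BIN`, and every function name below are hypothetical placeholders chosen only to make the mechanism concrete.

```python
from bisect import bisect_left

# Hypothetical count bins and per-bin query budgets (illustrative values only,
# not taken from the abstract).
COUNT_UPPER_BOUNDS = [10, 100, 500]       # inclusive upper bound of bins 0..2
QUERIES_PER_BIN = [300, 500, 900, 1500]   # one query budget per bin


def count_to_bin(num_instances: int) -> int:
    """Map an instance count to the index of its (hypothetical) count bin."""
    return bisect_left(COUNT_UPPER_BOUNDS, num_instances)


def select_num_queries(bin_scores) -> int:
    """Pick the query budget of the highest-scoring predicted count bin."""
    predicted_bin = max(range(len(bin_scores)), key=bin_scores.__getitem__)
    return QUERIES_PER_BIN[predicted_bin]


def top_k_positions(density_map, k: int):
    """Return (row, col) of the k highest-density cells, densest first.

    A stand-in for using a density map to seed query positions.
    """
    cells = [
        (value, r, c)
        for r, row in enumerate(density_map)
        for c, value in enumerate(row)
    ]
    cells.sort(key=lambda cell: -cell[0])  # stable: ties keep scan order
    return [(r, c) for _, r, c in cells[:k]]


if __name__ == "__main__":
    scores = [0.1, 0.2, 0.6, 0.1]          # classifier scores over the 4 bins
    print(select_num_queries(scores))      # query budget for the predicted bin
    print(top_k_positions([[0.1, 0.9], [0.5, 0.2]], k=2))
```

Under these assumptions, a sparse image is assigned a small query budget and a crowded one a large budget, instead of every image sharing one fixed number of queries.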
Authors: Yi-Xin Huang, Hou-I Liu, Hong-Han Shuai, Wen-Huang Cheng