Sparse Semi-DETR: Sparse Learnable Queries for Semi-Supervised Object Detection (2404.01819v1)

Published 2 Apr 2024 in cs.CV

Abstract: In this paper, we address the limitations of the DETR-based semi-supervised object detection (SSOD) framework, focusing on the challenges posed by the quality of object queries. In DETR-based SSOD, the one-to-one assignment strategy provides inaccurate pseudo-labels, while the one-to-many assignment strategy leads to overlapping predictions. These issues compromise training efficiency and degrade model performance, especially in detecting small or occluded objects. To overcome these challenges, we introduce Sparse Semi-DETR, a novel transformer-based, end-to-end semi-supervised object detection solution. Sparse Semi-DETR incorporates a Query Refinement Module to enhance the quality of object queries, significantly improving detection capabilities for small and partially obscured objects. Additionally, we integrate a Reliable Pseudo-Label Filtering Module that selectively filters high-quality pseudo-labels, thereby enhancing detection accuracy and consistency. On the MS-COCO and Pascal VOC object detection benchmarks, Sparse Semi-DETR achieves a significant improvement over current state-of-the-art methods, highlighting its effectiveness in semi-supervised object detection, particularly in challenging scenarios involving small or partially obscured objects.
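
The abstract does not specify how the Reliable Pseudo-Label Filtering Module selects pseudo-labels. As a rough illustration of the general idea of pseudo-label filtering in teacher-student SSOD, the sketch below keeps only teacher predictions whose confidence meets a fixed threshold. This is a generic baseline, not the paper's module; the names `Prediction` and `filter_pseudo_labels` and the threshold value `0.7` are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Prediction:
    """A single teacher prediction on an unlabeled image (hypothetical type)."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels
    label: int                              # predicted class index
    score: float                            # classification confidence in [0, 1]


def filter_pseudo_labels(
    predictions: List[Prediction],
    score_threshold: float = 0.7,  # assumed value, not taken from the paper
) -> List[Prediction]:
    """Keep predictions whose confidence is at least score_threshold.

    This is plain confidence thresholding, a common baseline for
    selecting pseudo-labels; the paper's filtering module may differ.
    """
    return [p for p in predictions if p.score >= score_threshold]


teacher_outputs = [
    Prediction(box=(10.0, 20.0, 50.0, 80.0), label=3, score=0.92),
    Prediction(box=(12.0, 22.0, 52.0, 82.0), label=3, score=0.41),
    Prediction(box=(100.0, 40.0, 140.0, 90.0), label=7, score=0.70),
]
pseudo_labels = filter_pseudo_labels(teacher_outputs)
```

With the assumed threshold, the low-confidence second prediction is discarded and the remaining two would serve as training targets for the student on that unlabeled image.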
