Diffusing More Objects for Semi-Supervised Domain Adaptation with Less Labeling (2312.12000v1)
Abstract: For object detection, it is possible to view the prediction of bounding boxes as a reverse diffusion process. Using a diffusion model, random bounding boxes are iteratively refined over successive denoising steps, conditioned on the image. We propose a stochastic accumulator function that starts each run with random bounding boxes and combines the slightly different predictions. We empirically verify that this improves detection performance. The improved detections are leveraged on unlabelled images as weighted pseudo-labels for semi-supervised learning. We evaluate the method on a challenging out-of-domain test set. Our method brings significant improvements and is on par with human-selected pseudo-labels, while not requiring any human involvement.
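The accumulation idea can be illustrated with a minimal sketch. The abstract does not say how the predictions are combined or how the pseudo-label weights are computed, so everything below is an assumption made for illustration: the names `accumulate` and `toy_detect`, the greedy IoU clustering, and the use of "fraction of runs that found the object" as the weight are hypothetical and are not taken from the paper.

```python
import random


def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def accumulate(detect, image, runs=5, iou_thr=0.5, seed=0):
    """Run a stochastic detector `runs` times and merge overlapping boxes.

    `detect(image, rng)` is a hypothetical callable returning a list of
    (box, score) pairs; each call is expected to start from different
    random boxes. Returns a list of (mean_box, mean_score, support),
    where support is the fraction of runs that produced the object.
    Support is an assumed weighting scheme, not the paper's.
    """
    rng = random.Random(seed)
    clusters = []
    for run in range(runs):
        for box, score in detect(image, rng):
            for c in clusters:
                # Greedy assignment to the first cluster that overlaps enough.
                if iou(box, c["mean"]) >= iou_thr:
                    c["boxes"].append(box)
                    c["scores"].append(score)
                    c["runs"].add(run)
                    n = len(c["boxes"])
                    c["mean"] = tuple(sum(b[i] for b in c["boxes"]) / n
                                      for i in range(4))
                    break
            else:
                # No overlapping cluster found: start a new one.
                clusters.append({"boxes": [box], "scores": [score],
                                 "runs": {run}, "mean": tuple(box)})
    return [(c["mean"], sum(c["scores"]) / len(c["scores"]),
             len(c["runs"]) / runs) for c in clusters]


def toy_detect(image, rng):
    """Toy stand-in for a diffusion detector: one stable object with
    slightly jittered coordinates, plus an object found only sometimes."""
    jitter = lambda v: v + rng.uniform(-1.0, 1.0)
    out = [(tuple(jitter(v) for v in (10.0, 10.0, 50.0, 50.0)), 0.9)]
    if rng.random() < 0.5:
        out.append((tuple(jitter(v) for v in (100.0, 100.0, 140.0, 140.0)), 0.6))
    return out


if __name__ == "__main__":
    for box, score, support in accumulate(toy_detect, image=None, runs=10):
        print([round(v, 1) for v in box], round(score, 2), support)
```

In this sketch, an object that reappears across most runs receives a support close to 1, while an object found in only a few runs receives a low support. A support value of this kind could serve as a per-box weight when the merged boxes are used as pseudo-labels.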