D3T: Distinctive Dual-Domain Teacher Zigzagging Across RGB-Thermal Gap for Domain-Adaptive Object Detection (2403.09359v1)
Abstract: Domain adaptation for object detection typically entails transferring knowledge from one visible domain to another visible domain. However, few studies address adaptation from the visible to the thermal domain, because the gap between these two domains is much larger than expected and traditional domain adaptation cannot successfully facilitate learning in this situation. To overcome this challenge, we propose a Distinctive Dual-Domain Teacher (D3T) framework that employs distinct training paradigms for each domain. Specifically, we segregate the source and target training sets to build dual teachers, and successively apply exponential moving average (EMA) updates from the student model to the individual teacher of each domain. The framework further incorporates a zigzag learning method between the dual teachers, facilitating a gradual transition from the visible to the thermal domain during training. We validate the superiority of our method through newly designed experimental protocols with well-known thermal datasets, i.e., FLIR and KAIST. Source code is available at https://github.com/EdwardDo69/D3T .
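The two mechanisms named in the abstract, per-domain EMA teachers and zigzag alternation between them, can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that). The abstract does not specify the schedule, so the linearly growing share of thermal steps is an assumption, and the names `ema_update`, `zigzag_schedule`, `train`, and `student_step` are hypothetical. Model weights are stood in for by plain dictionaries of floats.

```python
from typing import Callable, Dict, List

Params = Dict[str, float]  # toy stand-in for a detector's weights


def ema_update(teacher: Params, student: Params, decay: float = 0.99) -> None:
    """In-place EMA: teacher <- decay * teacher + (1 - decay) * student."""
    for name, value in student.items():
        teacher[name] = decay * teacher[name] + (1.0 - decay) * value


def zigzag_schedule(num_cycles: int, cycle_len: int) -> List[str]:
    """ASSUMED schedule (not taken from the paper): each cycle runs some RGB
    steps followed by thermal steps, and the number of thermal steps per
    cycle grows linearly from 1 to cycle_len - 1 over training."""
    if cycle_len < 2:
        raise ValueError("cycle_len must be at least 2")
    schedule: List[str] = []
    for c in range(num_cycles):
        n_thermal = 1 + (c * (cycle_len - 2)) // max(num_cycles - 1, 1)
        schedule += ["rgb"] * (cycle_len - n_thermal) + ["thermal"] * n_thermal
    return schedule


def train(
    schedule: List[str],
    student: Params,
    teachers: Dict[str, Params],
    student_step: Callable[[Params, str], None],
    decay: float = 0.99,
) -> None:
    """Alternate domains; only the active domain's teacher receives the EMA."""
    for domain in schedule:
        student_step(student, domain)  # hypothetical: one optimizer step
        ema_update(teachers[domain], student, decay)
```

Keeping one teacher per domain means each teacher averages only the student states reached while training on its own domain, which is one way to read the "distinct training paradigms for each domain" described above.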
- Dinh Phat Do
- Taehoon Kim
- Jaemin Na
- Jiwon Kim
- Keonho Lee
- Kyunghwan Cho
- Wonjun Hwang