CLIP-Guided Source-Free Object Detection in Aerial Images (2401.05168v2)
Abstract: Domain adaptation is crucial in aerial imagery, as the visual appearance of these images can vary significantly with geographic location, time, and weather conditions. Additionally, high-resolution aerial images often require substantial storage space and may not be readily accessible to the public. To address these challenges, we propose a novel Source-Free Object Detection (SFOD) method. Specifically, our approach begins with a self-training framework, which significantly enhances the performance of baseline methods. To mitigate the noisy pseudo-labels produced by self-training, we use Contrastive Language-Image Pre-training (CLIP) to guide pseudo-label generation, a step we term CLIP-guided Aggregation (CGA). By leveraging CLIP's zero-shot classification capability, we aggregate its scores with those of the originally predicted bounding boxes to obtain refined scores for the pseudo-labels. To validate the effectiveness of our method, we construct two new datasets from different domains based on the DIOR dataset, named DIOR-C and DIOR-Cloudy. Experimental results demonstrate that our method outperforms the compared algorithms. The code is available at https://github.com/Lans1ng/SFOD-RS.
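The score aggregation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the aggregation rule, so everything below is an assumption made for illustration: the convex combination with weight `alpha`, the confidence `threshold`, and the function names `aggregate_scores` and `select_pseudo_labels` are hypothetical and are not taken from the paper or its repository. The CLIP logits are assumed to be precomputed zero-shot similarities between each cropped box and one text prompt per class.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def aggregate_scores(det_probs, clip_logits, alpha=0.5):
    """Fuse detector class probabilities with CLIP zero-shot probabilities.

    ASSUMPTION: a simple weighted average; the paper's actual
    aggregation rule may differ.
    """
    clip_probs = softmax(clip_logits)
    return [alpha * d + (1.0 - alpha) * c for d, c in zip(det_probs, clip_probs)]


def select_pseudo_labels(boxes, det_probs, clip_logits, alpha=0.5, threshold=0.7):
    """Keep boxes whose fused top score reaches the threshold.

    Returns a list of (box, class_index, fused_score) tuples.
    """
    kept = []
    for box, d, c in zip(boxes, det_probs, clip_logits):
        fused = aggregate_scores(d, c, alpha)
        label = max(range(len(fused)), key=fused.__getitem__)
        if fused[label] >= threshold:
            kept.append((box, label, fused[label]))
    return kept


if __name__ == "__main__":
    # Toy data: three boxes, two classes (values are made up).
    boxes = [(0, 0, 10, 10), (5, 5, 20, 20), (1, 1, 2, 2)]
    det_probs = [[0.60, 0.40], [0.55, 0.45], [0.50, 0.50]]
    clip_logits = [[4.0, 0.0], [0.0, 4.0], [0.0, 0.0]]
    for box, label, score in select_pseudo_labels(boxes, det_probs, clip_logits):
        print(box, label, round(score, 3))
```

In this toy input the second box is a case where the detector weakly prefers class 0 but a confident CLIP score moves the fused label to class 1, while the third box, ambiguous under both models, falls below the threshold and is dropped. In practice the CLIP logits would come from encoding each cropped region and a set of class prompts with a CLIP model.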