Combining inherent knowledge of vision-language models with unsupervised domain adaptation through strong-weak guidance (2312.04066v4)
Abstract: Unsupervised domain adaptation (UDA) reduces the tedious work of labeling data by leveraging a labeled source dataset and transferring its knowledge to a similar but different target dataset. Meanwhile, current vision-language models exhibit remarkable zero-shot prediction capabilities. In this work, we combine the knowledge gained through UDA with the inherent knowledge of vision-language models. We introduce a strong-weak guidance learning scheme that employs zero-shot predictions to help align the source and target datasets. As strong guidance, we expand the source dataset with the most confident target samples; this guidance uses hard labels but is applied only to the most confident predictions on the target dataset. Conversely, the weak guidance is applied to the whole target dataset but uses soft labels: it is implemented as a knowledge distillation loss with (shifted) zero-shot predictions. We show that our method complements and benefits from prompt adaptation techniques for vision-language models. Experiments and ablation studies on three benchmarks (OfficeHome, VisDA, and DomainNet) show that our method outperforms state-of-the-art approaches, and the ablations further demonstrate the contributions of the different components of our algorithm.
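The strong-weak guidance described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `strong_weak_targets`, the confidence `threshold`, and `distillation_loss` are hypothetical names, and the zero-shot probabilities stand in for the output of a vision-language model such as CLIP.

```python
import numpy as np

def strong_weak_targets(zero_shot_probs, threshold=0.9):
    """Split target samples into strong and weak guidance.

    zero_shot_probs: (N, C) array of zero-shot class probabilities.
    Returns the indices of confident samples (to be merged into the
    source set with hard pseudo-labels), those hard labels, and the
    soft targets used for distillation over the whole target set.
    """
    confidence = zero_shot_probs.max(axis=1)
    confident_idx = np.where(confidence >= threshold)[0]      # strong guidance: only confident samples
    hard_labels = zero_shot_probs[confident_idx].argmax(axis=1)
    soft_targets = zero_shot_probs                            # weak guidance: every sample, soft labels
    return confident_idx, hard_labels, soft_targets

def distillation_loss(model_probs, soft_targets, eps=1e-8):
    """Cross-entropy between model predictions and (shifted) zero-shot soft targets."""
    return float(-np.mean(np.sum(soft_targets * np.log(model_probs + eps), axis=1)))
```

In this sketch, only samples whose top zero-shot probability clears the threshold receive hard pseudo-labels, while every target sample contributes to the soft distillation term, mirroring the hard/confident vs. soft/whole-dataset split in the abstract.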
- Thomas Westfechtel
- Dexuan Zhang
- Tatsuya Harada