Domain-Agnostic Mutual Prompting for Unsupervised Domain Adaptation (2403.02899v1)
Abstract: Conventional Unsupervised Domain Adaptation (UDA) strives to minimize distribution discrepancy between domains, which neglects to harness rich semantics from data and struggles to handle complex domain shifts. A promising technique is to leverage the knowledge of large-scale pre-trained vision-language models for more guided adaptation. Despite some endeavors, current methods often learn textual prompts to embed domain semantics for source and target domains separately and perform classification within each domain, limiting cross-domain knowledge transfer. Moreover, prompting only the language branch lacks flexibility to adapt both modalities dynamically. To bridge this gap, we propose Domain-Agnostic Mutual Prompting (DAMP) to exploit domain-invariant semantics by mutually aligning visual and textual embeddings. Specifically, the image contextual information is utilized to prompt the language branch in a domain-agnostic and instance-conditioned way. Meanwhile, visual prompts are imposed based on the domain-agnostic textual prompt to elicit domain-invariant visual embeddings. These two branches of prompts are learned mutually with a cross-attention module and regularized with a semantic-consistency loss and an instance-discrimination contrastive loss. Experiments on three UDA benchmarks demonstrate the superiority of DAMP over state-of-the-art approaches.
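To make the mutual-prompting design concrete, below is a minimal PyTorch sketch of how the two prompt branches could be coupled through cross-attention. The module names, prompt lengths, feature dimension, and the KL-based reading of the semantic-consistency term are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the mutual-prompting idea (assumptions, not the DAMP code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class MutualPrompting(nn.Module):
    def __init__(self, dim=512, n_text_prompts=4, n_visual_prompts=4, n_heads=8):
        super().__init__()
        # Cross-attention lets each branch's prompts attend to the other modality.
        self.img2text = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        self.text2img = nn.MultiheadAttention(dim, n_heads, batch_first=True)
        # Learnable domain-agnostic prompt tokens shared by source and target domains.
        self.text_prompts = nn.Parameter(torch.randn(n_text_prompts, dim) * 0.02)
        self.visual_prompts = nn.Parameter(torch.randn(n_visual_prompts, dim) * 0.02)

    def forward(self, img_tokens):
        """img_tokens: (B, N, dim) patch embeddings from a frozen image encoder."""
        B = img_tokens.size(0)
        t = self.text_prompts.unsqueeze(0).expand(B, -1, -1)
        v = self.visual_prompts.unsqueeze(0).expand(B, -1, -1)
        # Instance-conditioned textual prompts: attend to the image context.
        t = t + self.img2text(t, img_tokens, img_tokens)[0]
        # Visual prompts conditioned on the (now instance-aware) textual prompts.
        v = v + self.text2img(v, t, t)[0]
        return t, v

def semantic_consistency_loss(z_prompted, z_zero_shot, tau=0.07):
    """Keep prompted predictions close to zero-shot CLIP predictions
    (one plausible reading of the semantic-consistency regularizer)."""
    p = F.log_softmax(z_prompted / tau, dim=-1)
    q = F.softmax(z_zero_shot / tau, dim=-1)
    return F.kl_div(p, q, reduction="batchmean")
```

In the full method, the instance-conditioned textual prompts would feed the text encoder and the visual prompts would be prepended to the image patch tokens, with the semantic-consistency and instance-discrimination contrastive losses optimized jointly.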
Authors: Zhekai Du, Xinyao Li, Fengling Li, Ke Lu, Lei Zhu, Jingjing Li