Transitive Vision-Language Prompt Learning for Domain Generalization (2404.18758v1)

Published 29 Apr 2024 in cs.CV and cs.LG

Abstract: Vision-language pre-training has enabled deep models to take a major step forward in generalizing across unseen domains. Recent prompt learning methods built on vision-language pre-trained models are effective tools for domain generalization (DG) and address the problem to a large extent. However, they still suffer from a trade-off between domain invariance and class separability, both of which are crucial in current DG problems. In this paper, we introduce a novel prompt learning strategy that leverages deep vision prompts to promote domain invariance and language prompts to ensure class separability, coupled with adaptive weighting mechanisms to balance the two objectives. Extensive experiments demonstrate that deep vision prompts effectively extract domain-invariant features, significantly improving the generalization ability of deep models and achieving state-of-the-art performance on three datasets.
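The abstract sketches an architecture with three pieces: learnable deep vision prompts (for domain invariance), learnable language prompts (for class separability), and adaptive weights that balance the two objectives. The PyTorch sketch below is purely illustrative and is not the authors' implementation; the class name, the way prompts are injected into the encoders, the stand-in invariance loss, and the softmax weighting are all assumptions made for the example.

```python
# Illustrative sketch only -- not the paper's code. Assumes CLIP-style image
# and text encoders that can accept extra learnable prompt tokens.
import torch
import torch.nn as nn
import torch.nn.functional as F

class PromptLearnerSketch(nn.Module):
    def __init__(self, image_encoder, text_encoder, embed_dim,
                 num_vision_prompts=4, num_text_prompts=4, num_classes=7):
        super().__init__()
        self.image_encoder = image_encoder  # frozen vision tower (assumed interface)
        self.text_encoder = text_encoder    # frozen text tower (assumed interface)
        # Deep vision prompts, intended to capture domain-invariant cues.
        self.vision_prompts = nn.Parameter(0.02 * torch.randn(num_vision_prompts, embed_dim))
        # Language context prompts, intended to keep classes separable.
        self.text_prompts = nn.Parameter(0.02 * torch.randn(num_text_prompts, embed_dim))
        # Learnable weights for adaptive balancing of the two losses (assumption).
        self.loss_weights = nn.Parameter(torch.zeros(2))
        self.num_classes = num_classes

    def forward(self, images, class_token_embeds):
        # Inject vision prompts into the image encoder; the exact injection
        # point (shallow vs. deep layers) is an assumption here.
        img_feat = self.image_encoder(images, extra_tokens=self.vision_prompts)
        # Per-class text features from [context prompts; class-name embedding].
        ctx = self.text_prompts.unsqueeze(0).expand(self.num_classes, -1, -1)
        txt_feat = self.text_encoder(torch.cat([ctx, class_token_embeds], dim=1))
        img_feat = F.normalize(img_feat, dim=-1)
        txt_feat = F.normalize(txt_feat, dim=-1)
        logits = 100.0 * img_feat @ txt_feat.t()
        return logits, img_feat

def training_loss(logits, labels, img_feat, domain_labels, loss_weights):
    # Class separability: cross-entropy on image-text similarity logits.
    ce = F.cross_entropy(logits, labels)
    # Domain invariance: a simple stand-in that pulls per-domain feature means
    # together (the paper's actual invariance objective is not in the abstract).
    means = torch.stack([img_feat[domain_labels == d].mean(0)
                         for d in domain_labels.unique()])
    inv = (means - means.mean(0, keepdim=True)).pow(2).sum(dim=1).mean()
    # Adaptive weighting via a softmax over learnable weights (assumption).
    w = torch.softmax(loss_weights, dim=0)
    return w[0] * ce + w[1] * inv
```

The sketch only mirrors the high-level recipe described in the abstract; consult the paper for the actual prompt depth, loss definitions, and weighting scheme.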

Authors (7)
  1. Liyuan Wang (33 papers)
  2. Yan Jin (35 papers)
  3. Zhen Chen (151 papers)
  4. Jinlin Wu (37 papers)
  5. Mengke Li (19 papers)
  6. Yang Lu (158 papers)
  7. Hanzi Wang (66 papers)