
CWCL: Cross-Modal Transfer with Continuously Weighted Contrastive Loss (2309.14580v1)

Published 26 Sep 2023 in cs.LG, cs.AI, and cs.CV

Abstract: This paper considers contrastive training for cross-modal zero-shot transfer, wherein a model pre-trained in one modality is used for representation learning in another modality using pairwise data. The learnt models in the latter modality can then be used for a diverse set of tasks in a zero-shot way, similar to "Contrastive Language-Image Pre-training (CLIP)" and "Locked-image Tuning (LiT)", which have recently gained considerable attention. Most existing works for cross-modal representation alignment (including CLIP and LiT) use the standard contrastive training objective, which employs sets of positive and negative examples to align similar and repel dissimilar training data samples. However, similarity amongst training examples has a more continuous nature, calling for a more "non-binary" treatment. To address this, we propose a novel loss function called Continuously Weighted Contrastive Loss (CWCL) that employs a continuous measure of similarity. With CWCL, we seek to align the embedding space of one modality with another. Owing to the continuous nature of similarity in the proposed loss function, these models outperform existing methods for zero-shot transfer across multiple models, datasets and modalities. In particular, we consider the modality pairs of image-text and speech-text, and our models achieve 5-8% (absolute) improvement over previous state-of-the-art methods in zero-shot image classification and 20-30% (absolute) improvement in zero-shot speech-to-intent classification and keyword classification.
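To make the core idea concrete, below is a minimal sketch of a continuously weighted contrastive objective in the spirit of CWCL. It assumes the pairing weights are derived from intra-modal cosine similarity in the frozen modality's embedding space and mapped to [0, 1]; the exact weighting scheme, temperature value, and function names are illustrative assumptions, not the paper's verbatim implementation.

```python
import torch
import torch.nn.functional as F

def cwcl_loss(frozen_emb: torch.Tensor, trained_emb: torch.Tensor, tau: float = 0.07) -> torch.Tensor:
    """Sketch of a continuously weighted contrastive loss.

    frozen_emb:  (N, D) embeddings from the frozen, pre-trained modality (e.g., text)
    trained_emb: (N, D) embeddings from the modality being aligned (e.g., speech or image)
    tau:         softmax temperature (value here is an assumption)
    """
    f = F.normalize(frozen_emb, dim=-1)
    t = F.normalize(trained_emb, dim=-1)

    # Continuous pairwise weights from intra-modal similarity in the frozen
    # space, rescaled from [-1, 1] to [0, 1] and normalized per anchor.
    # This particular mapping is an illustrative choice.
    w = (f @ f.T + 1.0) / 2.0
    w = w / w.sum(dim=1, keepdim=True)

    # Cross-modal similarity logits, as in standard contrastive training.
    logits = (t @ f.T) / tau
    log_prob = F.log_softmax(logits, dim=1)

    # Standard contrastive loss would pick out only the diagonal (one positive
    # per anchor); here every pair contributes, weighted by its continuous
    # similarity, giving the "non-binary" treatment described in the abstract.
    return -(w * log_prob).sum(dim=1).mean()
```

With all weights concentrated on the diagonal (w = identity), this reduces to the standard InfoNCE-style contrastive objective used by CLIP and LiT, which is why the continuous weighting can be viewed as a strict generalization.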

Authors (7)
  1. Rakshith Sharma Srinivasa (5 papers)
  2. Jaejin Cho (24 papers)
  3. Chouchang Yang (4 papers)
  4. Yashas Malur Saidutta (5 papers)
  5. Ching-Hua Lee (6 papers)
  6. Yilin Shen (41 papers)
  7. Hongxia Jin (64 papers)
Citations (6)