
CODER: Coupled Diversity-Sensitive Momentum Contrastive Learning for Image-Text Retrieval (2208.09843v1)

Published 21 Aug 2022 in cs.CV

Abstract: Image-Text Retrieval (ITR) is challenging in bridging the visual and linguistic modalities. Contrastive learning has been adopted by most prior art, but its capability is restricted by the limited number of negative image-text pairs, the manual weighting of negative pairs, and unawareness of external knowledge. In this paper, we propose a novel Coupled Diversity-Sensitive Momentum Contrastive Learning (CODER) framework for improving cross-modal representation. First, a novel diversity-sensitive contrastive learning (DCL) architecture is introduced. We maintain dynamic dictionaries for both modalities to enlarge the scale of image-text pairs, and achieve diversity-sensitiveness through adaptive negative-pair weighting. Furthermore, CODER contains two branches. One learns instance-level embeddings from images/texts and also generates pseudo online clustering labels for its inputs based on these embeddings. The other learns to query a commonsense knowledge graph to form concept-level descriptors for both modalities. Both branches then leverage DCL to align the cross-modal embedding spaces, while an extra pseudo clustering-label prediction loss promotes concept-level representation learning in the second branch. Extensive experiments on two popular benchmarks, MSCOCO and Flickr30K, validate that CODER remarkably outperforms state-of-the-art approaches.
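To make the two core ingredients of DCL concrete, here is a minimal numpy sketch: an InfoNCE-style contrastive loss where negatives come from a momentum dictionary (queue) and are re-weighted by their similarity to the query, a simple stand-in for the paper's adaptive negative-pair weighting. The function names, the softmax-based weighting scheme, and the MoCo-style momentum update rule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def normalize(x):
    # L2-normalize embeddings along the last axis
    return x / np.linalg.norm(x, axis=-1, keepdims=True)

def dcl_loss(img, txt, queue, tau=0.07):
    """Sketch of a diversity-sensitive contrastive loss (assumed form).

    img, txt : (d,)   L2-normalized embeddings of a matched image-text pair.
    queue    : (K, d) L2-normalized negatives from a dynamic momentum dictionary.
    """
    pos = np.exp(img @ txt / tau)          # score of the positive pair
    s = queue @ img / tau                  # similarities to queued negatives
    # Adaptive weights: harder (more similar) negatives get larger weight.
    # Weights are normalized to sum to K, so uniform weighting would
    # recover the standard InfoNCE denominator.
    w = len(queue) * np.exp(s) / np.exp(s).sum()
    neg = (w * np.exp(s)).sum()
    return -np.log(pos / (pos + neg))

def momentum_update(theta_k, theta_q, m=0.999):
    # MoCo-style update of the dictionary (key) encoder parameters:
    # slow-moving average of the query encoder, keeping the queue consistent
    return m * theta_k + (1 - m) * theta_q
```

The dynamic dictionary decouples the number of negatives from the mini-batch size, which is how the architecture "enlarges the scale of image-text pairs" beyond what in-batch negatives allow.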

Authors (10)
  1. Haoran Wang (142 papers)
  2. Dongliang He (46 papers)
  3. Wenhao Wu (71 papers)
  4. Boyang Xia (8 papers)
  5. Min Yang (239 papers)
  6. Fu Li (86 papers)
  7. Yunlong Yu (34 papers)
  8. Zhong Ji (39 papers)
  9. Errui Ding (156 papers)
  10. Jingdong Wang (236 papers)
Citations (19)