MeDSLIP: Medical Dual-Stream Language-Image Pre-training for Fine-grained Alignment (2403.10635v1)

Published 15 Mar 2024 in cs.CV

Abstract: Vision-language pre-training (VLP) models have shown significant advancements in the medical domain. Yet, most VLP models align raw reports to images at a very coarse level, without modeling fine-grained relationships between anatomical and pathological concepts outlined in reports and the corresponding semantic counterparts in images. To address this problem, we propose a Medical Dual-Stream Language-Image Pre-training (MeDSLIP) framework. Specifically, MeDSLIP establishes vision-language fine-grained alignments by disentangling visual and textual representations into anatomy-relevant and pathology-relevant streams. Moreover, a novel vision-language Prototypical Contrastive Learning (ProtoCL) method is adopted in MeDSLIP to enhance the alignment within the anatomical and pathological streams. MeDSLIP further employs cross-stream Intra-image Contrastive Learning (ICL) to ensure the consistent coexistence of paired anatomical and pathological concepts within the same image. This cross-stream regularization encourages the model to exploit the synchrony between the two streams for more comprehensive representation learning. MeDSLIP is evaluated under zero-shot and supervised fine-tuning settings on three public datasets: NIH CXR14, RSNA Pneumonia, and SIIM-ACR Pneumothorax. Under these settings, MeDSLIP outperforms six leading CNN-based models on classification, grounding, and segmentation tasks.
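
The abstract does not give the loss formulation, so the following is only a generic illustration of the kind of cross-stream contrastive objective it describes, not the authors' ICL implementation. It sketches a standard symmetric InfoNCE loss in plain Python, under the assumption that each image yields one anatomy-stream embedding and one pathology-stream embedding, that embeddings from the same image form the positive pair, and that other images in the batch act as negatives. The function names, the batch-level negatives, and the temperature of 0.07 are all hypothetical choices made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def cross_entropy_row(logits, target):
    """Numerically stable -log(softmax(logits)[target])."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def symmetric_info_nce(anatomy, pathology, temperature=0.07):
    """Generic symmetric InfoNCE loss (an assumption, not the paper's exact ICL).

    anatomy[i] and pathology[i] are assumed to come from the same image and
    are treated as the positive pair; all other pairings are negatives.
    """
    a = [l2_normalize(v) for v in anatomy]
    p = [l2_normalize(v) for v in pathology]
    n = len(a)
    # sim[i][j]: temperature-scaled cosine similarity between anatomy i and pathology j.
    sim = [
        [sum(x * y for x, y in zip(a[i], p[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Anatomy -> pathology direction: each row should peak on its diagonal entry.
    loss_a2p = sum(cross_entropy_row(sim[i], i) for i in range(n)) / n
    # Pathology -> anatomy direction: each column should peak on its diagonal entry.
    loss_p2a = sum(
        cross_entropy_row([sim[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (loss_a2p + loss_p2a)


# Toy example: matched streams give a lower loss than mismatched ones.
aligned_loss = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled_loss = symmetric_info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
print(aligned_loss < shuffled_loss)
```

In this toy setup the loss is small when the two streams of each image agree and large when they are swapped, which is the behaviour a cross-stream consistency term of this kind is meant to encourage.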

Authors (9)
  1. Wenrui Fan (5 papers)
  2. Mohammod Naimul Islam Suvon (2 papers)
  3. Shuo Zhou (28 papers)
  4. Xianyuan Liu (12 papers)
  5. Samer Alabed (8 papers)
  6. Venet Osmani (17 papers)
  7. Andrew Swift (7 papers)
  8. Chen Chen (752 papers)
  9. Haiping Lu (37 papers)
Citations (3)