PMC-CLIP: Contrastive Language-Image Pre-training using Biomedical Documents (2303.07240v1)

Published 13 Mar 2023 in cs.CV, cs.CL, cs.LG, and cs.MM

Abstract: Foundation models trained on large-scale datasets have recently surged in CV and NLP. In contrast, development in the biomedical domain lags far behind due to data scarcity. To address this issue, we build and release PMC-OA, a biomedical dataset with 1.6M image-caption pairs collected from PubMedCentral's OpenAccess subset, 8 times larger than before. PMC-OA covers diverse modalities and diseases, with the majority of the image-caption samples aligned at a finer-grained level, i.e., subfigure and subcaption. Pretraining a CLIP-style model on PMC-OA, our model, named PMC-CLIP, achieves state-of-the-art results on various downstream tasks, including image-text retrieval on ROCO, MedMNIST image classification, and Medical VQA, e.g., +8.1% R@10 on image-text retrieval and +3.9% accuracy on image classification.
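The "CLIP-style" pretraining the abstract refers to pairs an image encoder and a text encoder and trains them with a symmetric contrastive (InfoNCE) objective over a batch of image-caption pairs. A minimal sketch of that objective is below; the function name, the NumPy formulation, and the temperature value are illustrative assumptions, not PMC-CLIP's actual implementation or hyperparameters.

```python
import numpy as np

def clip_contrastive_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric InfoNCE loss used by CLIP-style models (sketch).

    image_emb, text_emb: (N, D) arrays of paired embeddings; pair i is the
    positive match, and every other pair in the batch serves as a negative.
    `temperature` here is a common default, not necessarily PMC-CLIP's.
    """
    # L2-normalize so the dot product is a cosine similarity.
    image_emb = image_emb / np.linalg.norm(image_emb, axis=1, keepdims=True)
    text_emb = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)

    # (N, N) similarity matrix; the diagonal holds the matched pairs.
    logits = image_emb @ text_emb.T / temperature

    def cross_entropy(l):
        l = l - l.max(axis=1, keepdims=True)  # numerical stability
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))   # diagonal = correct pairs

    # Average the image-to-text and text-to-image directions.
    return 0.5 * (cross_entropy(logits) + cross_entropy(logits.T))
```

Intuitively, the loss is low when each image embedding is most similar to its own caption's embedding and dissimilar to every other caption in the batch, which is what drives the retrieval and classification gains reported above.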

Authors (7)
  1. Weixiong Lin (10 papers)
  2. Ziheng Zhao (11 papers)
  3. Xiaoman Zhang (31 papers)
  4. Chaoyi Wu (24 papers)
  5. Ya Zhang (222 papers)
  6. Yanfeng Wang (211 papers)
  7. Weidi Xie (132 papers)
Citations (92)
