Increasing Textual Context Size Boosts Medical Image-Text Matching

Published 23 Mar 2023 in cs.LG and cs.CV | (2303.13340v1)

Abstract: This short technical report demonstrates a simple technique that yields state of the art results in medical image-text matching tasks. We analyze the use of OpenAI's CLIP, a general image-text matching model, and observe that CLIP's limited textual input size has negative impact on downstream performance in the medical domain where encoding longer textual contexts is often required. We thus train and release ClipMD, which is trained with a simple sliding window technique to encode textual captions. ClipMD was tested on two medical image-text datasets and compared with other image-text matching models. The results show that ClipMD outperforms other models on both datasets by a large margin. We make our code and pretrained model publicly available.

Abstract PDF Upgrade to Chat

Citations (1)

View on Semantic Scholar

Summary

No one has generated a summary of this paper yet.

Paper to Video (Beta)

No one has generated a video about this paper yet.

Whiteboard

No one has generated a whiteboard explanation for this paper yet.

Paper Prompts

Top Community Prompts

Explain it Like I'm 14

off on

Knowledge Gaps

off on

Practical Applications

off on

Glossary

off on

Conceptual Simplification

off on

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Generate Now

Continue Learning

We haven't generated follow-up questions for this paper yet.

Generate Now

Increasing Textual Context Size Boosts Medical Image-Text Matching

Summary

Paper to Video (Beta)

Whiteboard

Paper Prompts

Top Community Prompts

Open Problems

Continue Learning

Authors (2)

Collections

Increasing Textual Context Size Boosts Medical Image-Text Matching

Summary

Paper to Video (Beta)

Whiteboard

Paper Prompts

Top Community Prompts

Open Problems

Continue Learning

Related Papers

Authors (2)

Collections