On Large Visual Language Models for Medical Imaging Analysis: An Empirical Study (2402.14162v1)
Abstract: Large language models (LLMs) have recently taken the spotlight in natural language processing, and integrating them with vision enables users to explore emergent abilities on multimodal data. Visual LLMs (VLMs), such as LLaVA, Flamingo, or CLIP, have demonstrated impressive performance on various visio-linguistic tasks, suggesting many potential applications in the biomedical imaging field. However, little prior work has examined the ability of such large models to diagnose diseases. In this work, we study the zero-shot and few-shot robustness of VLMs on medical imaging analysis tasks. Our comprehensive experiments demonstrate the effectiveness of VLMs in analyzing biomedical images such as brain MRIs, microscopic images of blood cells, and chest X-rays.
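To make the zero-shot protocol concrete, the sketch below scores a chest X-ray against candidate text labels with a CLIP-style model loaded through OpenCLIP. This is a minimal illustration under stated assumptions, not the paper's exact pipeline: the checkpoint name, label prompts, and image path are all illustrative choices, and few-shot evaluation of Flamingo-style models instead interleaves a handful of image-text examples in the prompt before the query image.

```python
# Minimal zero-shot classification sketch with a CLIP-style model via OpenCLIP.
# The checkpoint, label prompts, and image path below are illustrative
# assumptions, not the exact configuration used in the paper.
import torch
import open_clip
from PIL import Image

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

labels = ["a normal chest X-ray", "a chest X-ray showing pneumonia"]
image = preprocess(Image.open("chest_xray.png")).unsqueeze(0)  # hypothetical file
text = tokenizer(labels)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Cosine similarity between the image and each label prompt,
    # turned into a probability over the candidate labels.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```

The predicted class is simply the label prompt with the highest image-text similarity, which is what lets this approach be applied to new diagnostic tasks without any task-specific training data.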
- A. T. Sahlol, P. Kollmannsberger, and A. A. Ewees, “Efficient classification of white blood cell leukemia with improved swarm optimization of deep features,” Scientific Reports, 2020.
- C. Liu and Q. Yin, “Automatic diagnosis of COVID-19 using a tailored transformer-like network,” in JPCS, 2021.
- A. Radford, J. W. Kim, C. Hallacy, A. Ramesh, G. Goh, S. Agarwal, G. Sastry, A. Askell, P. Mishkin, J. Clark et al., “Learning transferable visual models from natural language supervision,” in ICML, 2021.
- J.-B. Alayrac, J. Donahue, P. Luc, A. Miech, I. Barr, Y. Hasson, K. Lenc, A. Mensch, K. Millican, M. Reynolds et al., “Flamingo: a visual language model for few-shot learning,” NeurIPS, 2022.
- H. Liu, C. Li, Q. Wu, and Y. J. Lee, “Visual instruction tuning,” arXiv preprint arXiv:2304.08485, 2023.
- OpenAI, “GPT-4 technical report,” 2023.
- S. Zhang, Y. Xu, N. Usuyama, J. Bagga, R. Tinn, S. Preston, R. Rao, M. Wei, N. Valluri, C. Wong et al., “Large-scale domain-specific pretraining for biomedical vision-language processing,” arXiv preprint arXiv:2303.00915, 2023.
- Z. Yan, K. Zhang, R. Zhou, L. He, X. Li, and L. Sun, “Multimodal ChatGPT for medical applications: an experimental study of GPT-4V,” arXiv preprint arXiv:2310.19061, 2023.
- K. Singhal, T. Tu, J. Gottweis, R. Sayres, E. Wulczyn, L. Hou, K. Clark, S. Pfohl, H. Cole-Lewis, D. Neal et al., “Towards expert-level medical question answering with large language models,” arXiv preprint arXiv:2305.09617, 2023.
- A. J. Thirunavukarasu, D. S. J. Ting, K. Elangovan, L. Gutierrez, T. F. Tan, and D. S. W. Ting, “Large language models in medicine,” Nature medicine, 2023.
- S. Wang, Z. Zhao, X. Ouyang, Q. Wang, and D. Shen, “ChatCAD: Interactive computer-aided diagnosis on medical image using large language models,” arXiv preprint arXiv:2302.07257, 2023.
- A. M. G. Allah, A. M. Sarhan, and N. M. Elshennawy, “Edge U-Net: Brain tumor segmentation using MRI based on deep U-Net model with boundary information,” Expert Systems with Applications, 2023.
- J. Cheng, “Brain tumor dataset,” Apr. 2017. [Online]. Available: https://figshare.com/articles/dataset/brain_tumor_dataset/1512427
- R. D. Labati, V. Piuri, and F. Scotti, “ALL-IDB: The acute lymphoblastic leukemia image database for image processing,” in ICIP, 2011.
- L. H. Vogado, R. M. Veras, F. H. Araujo, R. R. Silva, and K. R. Aires, “Leukemia diagnosis in blood slides using transfer learning in cnns and svm for classification,” Engineering Applications of Artificial Intelligence, 2018.
- F. Liu, T. Zhu, X. Wu, B. Yang, C. You, C. Wang, L. Lu, Z. Liu, Y. Zheng, X. Sun et al., “A medical multimodal large language model for future pandemics,” npj Digital Medicine, vol. 6, no. 1, p. 226, 2023.
- A. Dosovitskiy, L. Beyer, A. Kolesnikov, D. Weissenborn, X. Zhai, T. Unterthiner, M. Dehghani, M. Minderer, G. Heigold, S. Gelly et al., “An image is worth 16x16 words: Transformers for image recognition at scale,” arXiv preprint arXiv:2010.11929, 2020.
- Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, and S. Xie, “A convnet for the 2020s,” in CVPR, 2022.
- A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in NeurIPS, 2017.
- H. Touvron, L. Martin, K. Stone, P. Albert, A. Almahairi, Y. Babaei, N. Bashlykov, S. Batra, P. Bhargava, S. Bhosale et al., “Llama 2: Open foundation and fine-tuned chat models,” arXiv preprint arXiv:2307.09288, 2023.
- J. Wei, X. Wang, D. Schuurmans, M. Bosma, F. Xia, E. Chi, Q. V. Le, D. Zhou et al., “Chain-of-thought prompting elicits reasoning in large language models,” NeurIPS, 2022.
- J. P. Cohen, P. Morrison, L. Dao, K. Roth, T. Q. Duong, and M. Ghassemi, “COVID-19 image data collection: Prospective predictions are the future,” arXiv preprint arXiv:2006.11988, 2020.
- Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard, W. Hubbard, and L. D. Jackel, “Backpropagation applied to handwritten zip code recognition,” Neural Computation, 1989.
- K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in CVPR, 2016.
- G. Ilharco, M. Wortsman, R. Wightman, C. Gordon, N. Carlini, R. Taori, A. Dave, V. Shankar, H. Namkoong, J. Miller, H. Hajishirzi, A. Farhadi, and L. Schmidt, “OpenCLIP,” 2021.
- A. Awadalla, I. Gao, J. Gardner, J. Hessel, Y. Hanafy, W. Zhu, K. Marathe, Y. Bitton, S. Gadre, S. Sagawa et al., “OpenFlamingo: An open-source framework for training large autoregressive vision-language models,” arXiv preprint arXiv:2308.01390, 2023.
- Minh-Hao Van
- Prateek Verma
- Xintao Wu