OphGLM: Training an Ophthalmology Large Language-and-Vision Assistant based on Instructions and Dialogue (2306.12174v2)

Published 21 Jun 2023 in cs.CV

Abstract: Large multimodal models (LMMs) have achieved significant success in general domains. However, because medical images and text differ substantially from general web content, the performance of LMMs in medical scenarios is limited. In ophthalmology, clinical diagnosis relies on multiple modalities of medical images, yet multimodal ophthalmic LLMs have not been explored to date. In this paper, we study and construct an ophthalmic large multimodal model. First, we use fundus images as an entry point to build a disease assessment and diagnosis pipeline that performs common ophthalmic disease diagnosis and lesion segmentation. We then establish a new ophthalmic multimodal instruction-following and dialogue fine-tuning dataset based on disease-related knowledge data and publicly available real-world medical dialogue. Finally, we introduce visual ability into the LLM to complete the ophthalmic large language and vision assistant (OphGLM). Our experimental results demonstrate that OphGLM performs exceptionally well and has the potential to revolutionize clinical applications in ophthalmology. The dataset, code, and models will be made publicly available at https://github.com/ML-AILab/OphGLM.
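The abstract describes combining the fundus pipeline's outputs (disease diagnosis and lesion segmentation) with knowledge data and real-world medical dialogue to build instruction-following fine-tuning samples. The sketch below illustrates what serializing such pipeline outputs into one instruction-tuning record might look like; the FundusFindings dataclass, the build_instruction_sample function, and the record fields are hypothetical assumptions for illustration, not the paper's actual data format.

```python
# Minimal sketch (not the authors' code): turn hypothetical fundus-pipeline
# outputs into a single instruction-following record for LLM fine-tuning.
import json
from dataclasses import dataclass, field


@dataclass
class FundusFindings:
    """Hypothetical container for the fundus pipeline's outputs."""
    diagnosis: str                                # e.g. "diabetic retinopathy, moderate"
    lesions: list = field(default_factory=list)   # e.g. ["microaneurysms", "hard exudates"]


def build_instruction_sample(findings: FundusFindings, question: str) -> dict:
    """Combine image-derived findings with a patient question into one
    instruction-tuning record (schema is illustrative only)."""
    context = (
        f"Fundus image findings: diagnosis = {findings.diagnosis}; "
        f"lesions = {', '.join(findings.lesions) or 'none detected'}."
    )
    return {
        "instruction": "You are an ophthalmology assistant. Answer using the image findings.",
        "input": f"{context}\nPatient question: {question}",
        "output": "",  # to be filled from knowledge data or real dialogue responses
    }


if __name__ == "__main__":
    findings = FundusFindings(
        diagnosis="diabetic retinopathy, moderate",
        lesions=["microaneurysms", "hard exudates"],
    )
    sample = build_instruction_sample(findings, "Do I need laser treatment?")
    print(json.dumps(sample, indent=2, ensure_ascii=False))
```

In the paper, records of this kind are paired with disease-related knowledge and publicly available medical dialogues before fine-tuning, so the actual schema and contents would differ from this sketch.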

Authors (13)
  1. Weihao Gao (30 papers)
  2. Zhuo Deng (16 papers)
  3. Zhiyuan Niu (3 papers)
  4. Fuju Rong (2 papers)
  5. Chucheng Chen (2 papers)
  6. Zheng Gong (69 papers)
  7. Wenze Zhang (3 papers)
  8. Daimin Xiao (1 paper)
  9. Fang Li (142 papers)
  10. Zhenjie Cao (2 papers)
  11. Zhaoyi Ma (1 paper)
  12. Wenbin Wei (4 papers)
  13. Lan Ma (31 papers)
Citations (29)