Papers
Topics
Authors
Recent
Assistant
AI Research Assistant
Well-researched responses based on relevant abstracts and paper content.
Custom Instructions Pro
Preferences or requirements that you'd like Emergent Mind to consider when generating responses.
Gemini 2.5 Flash
Gemini 2.5 Flash 159 tok/s
Gemini 2.5 Pro 54 tok/s Pro
GPT-5 Medium 26 tok/s Pro
GPT-5 High 27 tok/s Pro
GPT-4o 34 tok/s Pro
Kimi K2 200 tok/s Pro
GPT OSS 120B 433 tok/s Pro
Claude Sonnet 4.5 36 tok/s Pro
2000 character limit reached

XDR-LVLM: An Explainable Vision-Language Large Model for Diabetic Retinopathy Diagnosis (2508.15168v1)

Published 21 Aug 2025 in cs.CV

Abstract: Diabetic Retinopathy (DR) is a major cause of global blindness, necessitating early and accurate diagnosis. While deep learning models have shown promise in DR detection, their black-box nature often hinders clinical adoption due to a lack of transparency and interpretability. To address this, we propose XDR-LVLM (eXplainable Diabetic Retinopathy Diagnosis with LVLM), a novel framework that leverages Vision-Language Large Models (LVLMs) for high-precision DR diagnosis coupled with natural language-based explanations. XDR-LVLM integrates a specialized Medical Vision Encoder, an LVLM Core, and employs Multi-task Prompt Engineering and Multi-stage Fine-tuning to deeply understand pathological features within fundus images and generate comprehensive diagnostic reports. These reports explicitly include DR severity grading, identification of key pathological concepts (e.g., hemorrhages, exudates, microaneurysms), and detailed explanations linking observed features to the diagnosis. Extensive experiments on the Diabetic Retinopathy (DDR) dataset demonstrate that XDR-LVLM achieves state-of-the-art performance, with a Balanced Accuracy of 84.55% and an F1 Score of 79.92% for disease diagnosis, and superior results for concept detection (77.95% BACC, 66.88% F1). Furthermore, human evaluations confirm the high fluency, accuracy, and clinical utility of the generated explanations, showcasing XDR-LVLM's ability to bridge the gap between automated diagnosis and clinical needs by providing robust and interpretable insights.

Summary

We haven't generated a summary for this paper yet.

Dice Question Streamline Icon: https://streamlinehq.com

Open Problems

We haven't generated a list of open problems mentioned in this paper yet.

Lightbulb Streamline Icon: https://streamlinehq.com

Continue Learning

We haven't generated follow-up questions for this paper yet.

List To Do Tasks Checklist Streamline Icon: https://streamlinehq.com

Collections

Sign up for free to add this paper to one or more collections.