
Medical Visual Question Answering: A Survey (2111.10056v3)

Published 19 Nov 2021 in cs.CV and cs.AI

Abstract: Medical Visual Question Answering (VQA) is a combination of medical artificial intelligence and popular VQA challenges. Given a medical image and a clinically relevant question in natural language, a medical VQA system is expected to predict a plausible and convincing answer. Although general-domain VQA has been extensively studied, medical VQA still needs specific investigation and exploration due to its task features. In the first part of this survey, we collect and discuss the publicly available medical VQA datasets to date in terms of data source, data quantity, and task features. In the second part, we review the approaches used in medical VQA tasks. We summarize and discuss their techniques, innovations, and potential improvements. In the last part, we analyze some medical-specific challenges for the field and discuss future research directions. Our goal is to provide comprehensive and helpful information for researchers interested in medical visual question answering and to encourage them to conduct further research in this field.

Authors (8)
  1. Zhihong Lin (20 papers)
  2. Donghao Zhang (13 papers)
  3. Qingyi Tao (16 papers)
  4. Danli Shi (20 papers)
  5. Gholamreza Haffari (141 papers)
  6. Qi Wu (323 papers)
  7. Mingguang He (22 papers)
  8. Zongyuan Ge (102 papers)
Citations (77)
