Are Generative AI systems Capable of Supporting Information Needs of Patients? (2402.00234v1)

Published 31 Jan 2024 in cs.HC, cs.AI, cs.CL, and cs.LG

Abstract: Patients managing a complex illness such as cancer face a substantial information challenge: they must learn not only about their illness but also how to manage it. Close interaction with healthcare experts (radiologists, oncologists) can improve patient learning and, thereby, their disease outcome. However, this approach is resource intensive and takes expert time away from other critical tasks. Given recent advancements in generative AI models aimed at improving the healthcare system, our work investigates whether and how generative visual question answering systems can responsibly support patient information needs in the context of radiology imaging data. We conducted a formative need-finding study in which participants discussed chest computed tomography (CT) scans and associated radiology reports of a fictitious close relative with a cardiothoracic radiologist. Using thematic analysis of the conversations between participants and medical experts, we identified commonly occurring themes across interactions, including clarifying medical terminology, locating in the scanned image the problems mentioned in the report, understanding disease prognosis, discussing the next diagnostic steps, and comparing treatment options. Based on these themes, we evaluated two state-of-the-art generative vision-language models against the radiologist's responses. Our results reveal variability in the quality of the models' responses across themes. We highlight the need for patient-facing generative AI systems to accommodate a diverse range of conversational themes, catering to the real-world informational needs of patients.
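The evaluation described in the abstract, scoring model answers against a radiologist's reference answers theme by theme, can be sketched with a simple lexical-overlap metric. The snippet below is a minimal illustration only: it implements a from-scratch ROUGE-1 F1 over toy answers; the theme names and answer strings are invented for the example and do not come from the paper's data, and the paper's actual evaluation may use different metrics entirely.

```python
from collections import Counter

def rouge1_f1(candidate: str, reference: str) -> float:
    """Unigram-overlap F1 between a model answer and a reference answer."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())  # multiset intersection of unigrams
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

# Hypothetical per-theme (model answer, radiologist answer) pairs --
# purely illustrative, not taken from the study.
themes = {
    "terminology": ("A nodule is a small rounded growth in the lung.",
                    "A pulmonary nodule is a small round growth in the lung."),
    "prognosis":   ("The outlook depends on stage and treatment response.",
                    "Prognosis varies with disease stage and treatment response."),
}

for theme, (model_ans, radiologist_ans) in themes.items():
    print(f"{theme}: ROUGE-1 F1 = {rouge1_f1(model_ans, radiologist_ans):.2f}")
```

Lexical overlap is only a rough proxy for answer quality; embedding-based scores (e.g. BERTScore) or expert judgment would capture paraphrases that unigram overlap misses.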

Authors (7)
  1. Shreya Rajagopal (1 paper)
  2. Subhashis Hazarika (9 papers)
  3. Sookyung Kim (9 papers)
  4. Yan-ming Chiou (4 papers)
  5. Jae Ho Sohn (6 papers)
  6. Hari Subramonyam (11 papers)
  7. Shiwali Mohan (16 papers)
Citations (1)