
Self-Diagnosis and Large Language Models: A New Front for Medical Misinformation (2307.04910v1)

Published 10 Jul 2023 in cs.CY

Abstract: Improving healthcare quality and access remains a critical concern for countries worldwide. Consequently, the rise of LLMs has sparked a wealth of discussion around healthcare applications among researchers and consumers alike. While the ability of these models to pass medical exams has been used to argue in favour of their use in medical training and diagnosis, the impact of their inevitable use as a self-diagnostic tool and their role in spreading healthcare misinformation has not been evaluated. In this work, we critically evaluate LLMs' capabilities through the lens of a general user self-diagnosing, as well as the means through which LLMs may aid in the spread of medical misinformation. To accomplish this, we develop a testing methodology which can be used to evaluate responses to open-ended questions mimicking real-world use cases. In doing so, we reveal that a) these models perform worse than previously known, and b) they exhibit peculiar behaviours, including overconfidence when stating incorrect recommendations, which increases the risk of spreading medical misinformation.

Authors (3)
  1. Francois Barnard (2 papers)
  2. Marlize Van Sittert (1 paper)
  3. Sirisha Rambhatla (27 papers)
Citations (10)