The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs) (2403.14473v1)
Abstract: With the introduction of ChatGPT, LLMs have received enormous attention in healthcare. Despite their potential benefits, researchers have underscored various ethical implications. While individual instances have drawn much attention, the debate lacks a systematic overview of the practical applications currently being researched and the ethical issues connected to them. Against this background, this work aims to map the ethical landscape surrounding the current stage of deployment of LLMs in medicine and healthcare. Electronic databases and preprint servers were queried using a comprehensive search strategy. Studies were screened and extracted following a modified rapid review approach. Methodological quality was assessed using a hybrid approach. For 53 records, a meta-aggregative synthesis was performed. Four fields of application emerged and testify to a vivid exploration phase. Advantages of using LLMs are attributed to their capacity for data analysis, personalized information provisioning, support in decision-making, mitigating information loss, and enhancing information accessibility. However, we also identify recurrent ethical concerns connected to fairness, bias, non-maleficence, transparency, and privacy. A distinctive concern is the tendency to produce harmful misinformation or convincing but inaccurate content. A recurrent plea for ethical guidance and human oversight is evident. Given the variety of use cases, it is suggested that the ethical guidance debate be reframed to focus on defining what constitutes acceptable human oversight across the spectrum of applications. This involves considering diverse settings, varying potentials for harm, and different acceptable thresholds for performance and certainty in healthcare. In addition, a critical inquiry is necessary to determine the extent to which the current experimental use of LLMs is necessary and justified.