
Enhancing Large Language Models for Clinical Decision Support by Incorporating Clinical Practice Guidelines (2401.11120v2)

Published 20 Jan 2024 in cs.CL and cs.AI

Abstract:

Background: LLMs, enhanced with Clinical Practice Guidelines (CPGs), can significantly improve Clinical Decision Support (CDS). However, methods for incorporating CPGs into LLMs are not well studied.

Methods: We develop three distinct methods for incorporating CPGs into LLMs: Binary Decision Tree (BDT), Program-Aided Graph Construction (PAGC), and Chain-of-Thought-Few-Shot Prompting (CoT-FSP). To evaluate the effectiveness of the proposed methods, we create a set of synthetic patient descriptions and conduct both automatic and human evaluation of the responses generated by four LLMs: GPT-4, GPT-3.5 Turbo, LLaMA, and PaLM 2. Zero-Shot Prompting (ZSP) was used as the baseline method. We focus on CDS for COVID-19 outpatient treatment as the case study.

Results: All four LLMs exhibit improved performance when enhanced with CPGs compared to the baseline ZSP. BDT outperformed both CoT-FSP and PAGC in automatic evaluation. All of the proposed methods demonstrated high performance in human evaluation.

Conclusion: LLMs enhanced with CPGs demonstrate superior performance, as compared to plain LLMs with ZSP, in providing accurate recommendations for COVID-19 outpatient treatment, which also highlights the potential for broader applications beyond the case study.
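The abstract names a Binary Decision Tree (BDT) as one way to put a guideline in front of an LLM, but this listing does not show the paper's actual tree or prompt format. The following is only a minimal sketch of the general idea, under stated assumptions: the `Node` and `Leaf` classes, the `render_tree`, `build_prompt`, and `traverse` helpers, and the placeholder questions and recommendations are all hypothetical and carry no clinical content.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Leaf:
    """Terminal node holding a (placeholder) recommendation."""
    recommendation: str


@dataclass
class Node:
    """Internal node: a yes/no question with one subtree per answer."""
    question: str
    yes: Union["Node", Leaf]
    no: Union["Node", Leaf]


def render_tree(node: Union[Node, Leaf], depth: int = 0) -> str:
    """Serialize the tree as indented text that can be pasted into a prompt."""
    pad = "  " * depth
    if isinstance(node, Leaf):
        return f"{pad}-> {node.recommendation}"
    return "\n".join([
        f"{pad}Q: {node.question}",
        f"{pad}if yes:",
        render_tree(node.yes, depth + 1),
        f"{pad}if no:",
        render_tree(node.no, depth + 1),
    ])


def build_prompt(tree: Node, patient_description: str) -> str:
    """Combine the serialized tree and a patient description (hypothetical wording)."""
    return (
        "Follow this decision tree to choose a recommendation.\n\n"
        f"{render_tree(tree)}\n\n"
        f"Patient description: {patient_description}\n"
        "Recommendation:"
    )


def traverse(tree: Union[Node, Leaf], answers: Dict[str, bool]) -> str:
    """Walk the tree with known yes/no answers to get a reference recommendation."""
    node = tree
    while isinstance(node, Node):
        node = node.yes if answers[node.question] else node.no
    return node.recommendation


# Placeholder tree; the questions and options are NOT taken from any guideline.
EXAMPLE_TREE = Node(
    question="Criterion A met?",
    yes=Node(
        question="Criterion B met?",
        yes=Leaf("Option 1"),
        no=Leaf("Option 2"),
    ),
    no=Leaf("Option 3"),
)
```

In a setup like this, `build_prompt` would produce the text sent to a model, while `traverse` gives a deterministic reference answer that a model's response could be compared against in an automatic evaluation.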

Authors (7)
  1. David Oniani
  2. Xizhi Wu
  3. Shyam Visweswaran
  4. Sumit Kapoor
  5. Shravan Kooragayalu
  6. Katelyn Polanska
  7. Yanshan Wang