Answering real-world clinical questions using large language model based systems (2407.00541v1)

Published 29 Jun 2024 in cs.CL, cs.AI, and cs.IR

Abstract: Evidence to guide healthcare decisions is often limited by a lack of relevant and trustworthy literature as well as difficulty in contextualizing existing research for a specific patient. Large language models (LLMs) could potentially address both challenges by either summarizing published literature or generating new studies based on real-world data (RWD). We evaluated the ability of five LLM-based systems to answer 50 clinical questions and had nine independent physicians review the responses for relevance, reliability, and actionability. As it stands, general-purpose LLMs (ChatGPT-4, Claude 3 Opus, Gemini Pro 1.5) rarely produced answers that were deemed relevant and evidence-based (2% to 10% of questions). In contrast, retrieval-augmented generation (RAG)-based and agentic LLM systems produced relevant and evidence-based answers for 24% (OpenEvidence) to 58% (ChatRWD) of questions. Only the agentic ChatRWD was able to answer novel questions (65%, compared with 0% to 9% for the other LLMs). These results suggest that while general-purpose LLMs should not be used as-is, a purpose-built system for evidence summarization based on RAG and one for generating novel evidence, working synergistically, would improve the availability of pertinent evidence for patient care.

Authors (27)
  1. Yen Sia Low
  2. Michael L. Jackson
  3. Rebecca J. Hyde
  4. Robert E. Brown
  5. Neil M. Sanghavi
  6. Julian D. Baldwin
  7. C. William Pike
  8. Jananee Muralidharan
  9. Gavin Hui
  10. Natasha Alexander
  11. Hadeel Hassan
  12. Rahul V. Nene
  13. Morgan Pike
  14. Courtney J. Pokrzywa
  15. Shivam Vedak
  16. Adam Paul Yan
  17. Dong-han Yao
  18. Amy R. Zipursky
  19. Christina Dinh
  20. Philip Ballentine