Introducing Answered with Evidence -- a framework for evaluating whether LLM responses to biomedical questions are founded in evidence (2507.02975v1)

Published 30 Jun 2025 in cs.LG and cs.IR

Abstract: The growing use of LLMs for biomedical question answering raises concerns about the accuracy and evidentiary support of their responses. To address this, we present Answered with Evidence, a framework for evaluating whether LLM-generated answers are grounded in scientific literature. We analyzed thousands of physician-submitted questions using a comparative pipeline that included: (1) Alexandria, fka the Atropos Evidence Library, a retrieval-augmented generation (RAG) system based on novel observational studies, and (2) two PubMed-based retrieval-augmented systems (System and Perplexity). We found that PubMed-based systems provided evidence-supported answers for approximately 44% of questions, while the novel evidence source did so for about 50%. Combined, these sources enabled reliable answers to over 70% of biomedical queries. As LLMs become increasingly capable of summarizing scientific content, maximizing their value will require systems that can accurately retrieve both published and custom-generated evidence or generate such evidence in real time.

Collections

Summary

We haven't generated a summary for this paper yet.

Summarize Now

Paper Prompts

Explore 10 Community Prompts

Follow-up Questions

We haven't generated follow-up questions for this paper yet.

Generate Now

Introducing Answered with Evidence -- a framework for evaluating whether LLM responses to biomedical questions are founded in evidence (2507.02975v1)

Collections

Summary

Paper Prompts

Follow-up Questions

Authors (5)

Don't miss out on important new AI/ML research

Introducing Answered with Evidence -- a framework for evaluating whether LLM responses to biomedical questions are founded in evidence (2507.02975v1)

Collections

Summary

Paper Prompts

Follow-up Questions

Related Papers

Authors (5)

Don't miss out on important new AI/ML research