
Automated Clinical Data Extraction with Knowledge Conditioned LLMs (2406.18027v2)

Published 26 Jun 2024 in cs.CL and cs.AI

Abstract: The extraction of lung lesion information from clinical and medical imaging reports is crucial for research on and clinical care of lung-related diseases. LLMs can be effective at interpreting unstructured text in reports, but they often hallucinate due to a lack of domain-specific knowledge, leading to reduced accuracy and posing challenges for use in clinical settings. To address this, we propose a novel framework that aligns generated internal knowledge with external knowledge through in-context learning (ICL). Our framework employs a retriever to identify relevant units of internal or external knowledge and a grader to evaluate the truthfulness and helpfulness of the retrieved internal-knowledge rules, to align and update the knowledge bases. Experiments with expert-curated test datasets demonstrate that this ICL approach can increase the F1 score for key fields (lesion size, margin and solidity) by an average of 12.9% over existing ICL methods.
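The retrieve-then-grade loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the `Rule` class, the Jaccard token-overlap retriever, the precomputed `truthful` and `helpful` scores, the 0.5 threshold, and the example rules are all assumptions made for illustration. In the paper's setting the grading would presumably come from a model rather than hard-coded numbers.

```python
from dataclasses import dataclass


@dataclass
class Rule:
    """A hypothetical unit of knowledge with illustrative grader scores in [0, 1]."""
    text: str
    truthful: float
    helpful: float


def tokenize(s):
    # Lowercased whitespace tokens; a stand-in for a real text encoder.
    return set(s.lower().split())


def retrieve(report, rules, k=3):
    """Return the k rules with the highest Jaccard token overlap with the report."""
    query = tokenize(report)

    def score(rule):
        tokens = tokenize(rule.text)
        union = query | tokens
        return len(query & tokens) / len(union) if union else 0.0

    return sorted(rules, key=score, reverse=True)[:k]


def grade(rules, threshold=0.5):
    """Keep only rules whose truthfulness and helpfulness both meet the threshold."""
    return [r for r in rules if min(r.truthful, r.helpful) >= threshold]


def build_prompt(report, rules):
    """Assemble an in-context prompt from the surviving rules and the report."""
    lines = ["Rules:"]
    lines += [f"- {r.text}" for r in rules]
    lines += ["Report:", report, "Extract: lesion size, margin, solidity."]
    return "\n".join(lines)


# Made-up example data, not taken from the paper.
rules = [
    Rule("lesion size is reported in mm or cm", 0.9, 0.8),
    Rule("a spiculated margin describes a nodule with spiky borders", 0.9, 0.9),
    Rule("solid nodule means solidity is solid", 0.4, 0.9),  # fails the grader
    Rule("heart size is normal", 0.9, 0.9),                  # irrelevant to the report
]
report = "Spiculated solid nodule measuring 12 mm in right upper lobe"

retrieved = retrieve(report, rules, k=3)
kept = grade(retrieved)
prompt = build_prompt(report, kept)
print(prompt)
```

Separating retrieval from grading keeps the two concerns independent: the retriever only decides relevance to the report, while the grader decides whether a relevant rule is trustworthy enough to place in the prompt.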

Authors (5)
  1. Diya Li (5 papers)
  2. Asim Kadav (22 papers)
  3. Aijing Gao (2 papers)
  4. Rui Li (384 papers)
  5. Richard Bourgon (2 papers)
Citations (5)