Automated Clinical Data Extraction with Knowledge Conditioned LLMs (2406.18027v2)
Abstract: The extraction of lung lesion information from clinical and medical imaging reports is crucial for research on and clinical care of lung-related diseases. LLMs can be effective at interpreting unstructured text in reports, but they often hallucinate due to a lack of domain-specific knowledge, leading to reduced accuracy and posing challenges for use in clinical settings. To address this, we propose a novel framework that aligns generated internal knowledge with external knowledge through in-context learning (ICL). Our framework employs a retriever to identify relevant units of internal or external knowledge and a grader to evaluate the truthfulness and helpfulness of the retrieved internal-knowledge rules, to align and update the knowledge bases. Experiments with expert-curated test datasets demonstrate that this ICL approach can increase the F1 score for key fields (lesion size, margin and solidity) by an average of 12.9% over existing ICL methods.
- Diya Li (5 papers)
- Asim Kadav (22 papers)
- Aijing Gao (2 papers)
- Rui Li (384 papers)
- Richard Bourgon (2 papers)