Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models (2311.00287v2)

Published 1 Nov 2023 in cs.CL, cs.AI, cs.LG, and q-bio.QM

Abstract: Clinical natural language processing requires methods that can address domain-specific challenges, such as complex medical terminology and clinical contexts. Recently, large language models (LLMs) have shown promise in this domain. Yet their direct deployment can raise privacy issues and is constrained by resources. To address this challenge, we delve into synthetic clinical text generation using LLMs for clinical NLP tasks. We propose an innovative, resource-efficient approach, ClinGen, which infuses knowledge into the process. Our model involves clinical knowledge extraction and context-informed LLM prompting. Both clinical topics and writing styles are drawn from external domain-specific knowledge graphs and LLMs to guide data generation. Our extensive empirical study across 7 clinical NLP tasks and 16 datasets reveals that ClinGen consistently enhances performance across various tasks, effectively aligning with the distribution of real datasets and significantly enriching the diversity of generated training instances. Our code is available at https://github.com/ritaranx/ClinGen.
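The abstract describes prompting that is conditioned on clinical topics and writing styles. As a rough illustration only, the sketch below shows one way such a prompt could be assembled by sampling a topic and a style. The lists `TOPICS` and `STYLES`, the functions `build_prompt` and `build_prompts`, and the prompt wording are all hypothetical and are not taken from the ClinGen code; the paper draws its topics and styles from knowledge graphs and LLMs, which this sketch does not reproduce.

```python
import random

# Hypothetical placeholder values (assumption): in the paper, topics and
# writing styles come from domain-specific knowledge graphs and LLMs.
TOPICS = ["hypertension", "type 2 diabetes", "asthma"]
STYLES = ["discharge summary", "clinical trial abstract", "patient forum question"]


def build_prompt(task: str, label: str, rng: random.Random) -> str:
    """Compose one generation prompt from a sampled topic and style."""
    topic = rng.choice(TOPICS)
    style = rng.choice(STYLES)
    return (
        f"Write a {style} about {topic} "
        f"for the task '{task}' with label '{label}'."
    )


def build_prompts(task: str, label: str, n: int, seed: int = 0) -> list[str]:
    """Build n prompts; a fixed seed makes the sampling reproducible."""
    rng = random.Random(seed)
    return [build_prompt(task, label, rng) for _ in range(n)]


if __name__ == "__main__":
    for prompt in build_prompts("sentence classification", "adverse drug event", 3):
        print(prompt)
```

Each resulting string would then be sent to an LLM to produce one synthetic training instance; varying the topic and style per prompt is what is meant to diversify the generated data.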

GitHub

GitHub - ritaranx/ClinGen: This is the code for our paper "Knowledge-Infused Prompting: Assessing and Advancing Clinical Text Data Generation with Large Language Models". (31 stars)
