Improving Generalizability of Extracting Social Determinants of Health Using Large Language Models through Prompt-tuning (2403.12374v1)

Published 19 Mar 2024 in cs.CL

Abstract: Progress in NLP with LLMs has greatly improved the extraction of patient information from clinical narratives. However, most fine-tuning-based methods have limited transfer-learning ability for cross-domain applications. This study proposed a novel approach that employs a soft prompt-based learning architecture, which introduces trainable prompts to guide LLMs toward desired outputs. We examined two LLM architectures, the encoder-only GatorTron and the decoder-only GatorTronGPT, and evaluated their performance on the extraction of social determinants of health (SDoH) using a cross-institution dataset from the 2022 n2c2 challenge and a cross-disease dataset from University of Florida (UF) Health. The results show that decoder-only LLMs with prompt tuning achieved better performance in cross-domain applications. GatorTronGPT achieved the best F1 scores on both datasets, outperforming the traditionally fine-tuned GatorTron by 8.9% and 21.8% in the cross-institution setting and by 5.5% and 14.5% in the cross-disease setting.
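
The technique the abstract names, soft prompt tuning, keeps all of the LLM's weights frozen and trains only a small matrix of "virtual token" embeddings that is prepended to every input. Below is a minimal PyTorch sketch of that mechanism over a generic Hugging Face causal LM; the class name, prompt length, and initialization scale are illustrative assumptions, not the authors' GatorTronGPT implementation.

```python
import torch
import torch.nn as nn
from transformers import AutoModelForCausalLM

class SoftPromptLM(nn.Module):
    """Frozen causal LM with trainable soft-prompt ("virtual token") embeddings.

    Only `self.soft_prompt` receives gradient updates; the base model's
    weights stay fixed, which is what makes prompt tuning lightweight.
    """

    def __init__(self, base_model, num_prompt_tokens: int = 20):
        super().__init__()
        self.base_model = base_model
        for p in self.base_model.parameters():
            p.requires_grad = False  # freeze the LLM
        hidden = self.base_model.config.hidden_size
        # Trainable virtual tokens, initialized with a small random spread.
        self.soft_prompt = nn.Parameter(0.02 * torch.randn(num_prompt_tokens, hidden))

    def forward(self, input_ids, attention_mask, labels=None):
        # Embed the real tokens, then prepend the soft prompt to each example.
        tok_embeds = self.base_model.get_input_embeddings()(input_ids)
        batch = tok_embeds.size(0)
        prompt = self.soft_prompt.unsqueeze(0).expand(batch, -1, -1)
        inputs_embeds = torch.cat([prompt, tok_embeds], dim=1)

        # Extend the attention mask to cover the prompt positions.
        prompt_mask = attention_mask.new_ones(batch, self.soft_prompt.size(0))
        attention_mask = torch.cat([prompt_mask, attention_mask], dim=1)

        if labels is not None:
            # -100 tells the LM loss to ignore the prompt positions.
            ignore = labels.new_full((batch, self.soft_prompt.size(0)), -100)
            labels = torch.cat([ignore, labels], dim=1)

        return self.base_model(inputs_embeds=inputs_embeds,
                               attention_mask=attention_mask,
                               labels=labels)

# Example: any Hugging Face causal LM can serve as the frozen backbone
# (the checkpoint name here is just a placeholder).
# model = SoftPromptLM(AutoModelForCausalLM.from_pretrained("gpt2"))
```

Since only `soft_prompt` requires gradients, the optimizer is built over that single tensor (e.g. `torch.optim.AdamW([model.soft_prompt], lr=1e-3)`), which is why prompt tuning is far cheaper than full fine-tuning; the Hugging Face `peft` library offers a packaged equivalent via `PromptTuningConfig`.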

Authors (6)
  1. Cheng Peng
  2. Zehao Yu
  3. Kaleb E Smith
  4. Wei-Hsuan Lo-Ciganic
  5. Jiang Bian
  6. Yonghui Wu