SDoH-GPT: Using Large Language Models to Extract Social Determinants of Health (SDoH) (2407.17126v1)
Abstract: Extracting social determinants of health (SDoH) from unstructured medical notes depends heavily on labor-intensive annotations, which are typically task-specific, hampering reusability and limiting sharing. In this study, we introduce SDoH-GPT, a simple and effective few-shot LLM method that leverages contrastive examples and concise instructions to extract SDoH without relying on extensive medical annotations or costly human intervention. It achieved tenfold and twentyfold reductions in time and cost, respectively, and superior consistency with human annotators, as measured by Cohen's kappa of up to 0.92. Combining SDoH-GPT with XGBoost leverages the strengths of both, ensuring high accuracy and computational efficiency while consistently maintaining AUROC scores above 0.90. Testing across three distinct datasets confirmed its robustness and accuracy. This study highlights the potential of LLMs to revolutionize medical note classification, demonstrating their capability to achieve highly accurate classifications with significantly reduced time and cost.
- Bernardo Consoli
- Xizhi Wu
- Song Wang
- Xinyu Zhao
- Yanshan Wang
- Justin Rousseau
- Tom Hartvigsen
- Li Shen
- Huanmei Wu
- Yifan Peng
- Qi Long
- Tianlong Chen
- Ying Ding
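
The abstract describes a two-stage pipeline: few-shot LLM labeling with contrastive examples, followed by an XGBoost classifier for efficient downstream classification. The sketch below illustrates one plausible way such a pipeline could be wired together; the prompt wording, SDoH category, model name, and TF-IDF feature choice are illustrative assumptions rather than the authors' exact setup.

```python
# Hypothetical sketch of an SDoH-GPT-style pipeline:
# (1) few-shot prompting with contrastive (positive/negative) examples to label
#     notes for one SDoH category, (2) distilling those LLM labels into a fast
#     XGBoost classifier over simple text features.
from openai import OpenAI
from sklearn.feature_extraction.text import TfidfVectorizer
from xgboost import XGBClassifier

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# Contrastive few-shot prompt; wording and SDoH category are assumptions.
FEW_SHOT_PROMPT = """You label clinical notes for housing instability (yes/no).
Example (positive): "Patient is currently homeless and staying in a shelter." -> yes
Example (negative): "Patient lives at home with spouse; no housing concerns." -> no
Note: "{note}" ->"""

def llm_label(note: str) -> int:
    """Ask the LLM for a yes/no SDoH label on a single note."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": FEW_SHOT_PROMPT.format(note=note)}],
        temperature=0,
    )
    return int(resp.choices[0].message.content.strip().lower().startswith("yes"))

def train_xgb_on_llm_labels(notes: list[str]) -> tuple[TfidfVectorizer, XGBClassifier]:
    """Distill LLM-derived labels into an efficient XGBoost classifier."""
    labels = [llm_label(n) for n in notes]
    vec = TfidfVectorizer(max_features=5000)
    X = vec.fit_transform(notes)
    clf = XGBClassifier(n_estimators=200, max_depth=4, eval_metric="logloss")
    clf.fit(X, labels)
    return vec, clf
```

Once trained, the XGBoost model can score new notes without further LLM calls, which is where the time and cost savings reported in the abstract would come from under this interpretation.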