Detecting Bias and Enhancing Diagnostic Accuracy in Large Language Models for Healthcare (2410.06566v1)

Published 9 Oct 2024 in cs.CL

Abstract: Biased AI-generated medical advice and misdiagnoses can jeopardize patient safety, making the integrity of AI in healthcare more critical than ever. As LLMs take on a growing role in medical decision-making, addressing their biases and enhancing their accuracy is key to delivering safe, reliable care. This study addresses these challenges head-on by introducing new resources designed to promote ethical and precise AI in healthcare. We present two datasets: BiasMD, featuring 6,007 question-answer pairs crafted to evaluate and mitigate biases in health-related LLM outputs, and DiseaseMatcher, with 32,000 clinical question-answer pairs spanning 700 diseases, aimed at assessing symptom-based diagnostic accuracy. Using these datasets, we developed EthiClinician, a fine-tuned model built on the ChatDoctor framework, which outperforms GPT-4 in both ethical reasoning and clinical judgment. By exposing and correcting hidden biases in existing models for healthcare, our work sets a new benchmark for safer, more reliable patient outcomes.

Authors (2)

Pardis Sadat Zahraei
Zahra Shakeri