
AI Hospital: Benchmarking Large Language Models in a Multi-agent Medical Interaction Simulator (2402.09742v4)

Published 15 Feb 2024 in cs.CL

Abstract: Artificial intelligence has significantly advanced healthcare, particularly through LLMs that excel in medical question answering benchmarks. However, their real-world clinical application remains limited due to the complexities of doctor-patient interactions. To address this, we introduce AI Hospital, a multi-agent framework simulating dynamic medical interactions between a Doctor as the player and NPCs including a Patient, an Examiner, and a Chief Physician. This setup allows for realistic assessments of LLMs in clinical scenarios. We develop the Multi-View Medical Evaluation (MVME) benchmark, utilizing high-quality Chinese medical records and NPCs to evaluate LLMs' performance in symptom collection, examination recommendations, and diagnoses. Additionally, a dispute resolution collaborative mechanism is proposed to enhance diagnostic accuracy through iterative discussions. Despite improvements, current LLMs exhibit significant performance gaps in multi-turn interactions compared to one-step approaches. Our findings highlight the need for further research to bridge these gaps and improve LLMs' clinical diagnostic capabilities. Our data, code, and experimental results are all open-sourced at https://github.com/LibertFan/AI_Hospital.

Authors (8)
  1. Zhihao Fan (28 papers)
  2. Jialong Tang (17 papers)
  3. Wei Chen (1288 papers)
  4. Siyuan Wang (73 papers)
  5. Zhongyu Wei (98 papers)
  6. Jun Xi (3 papers)
  7. Fei Huang (408 papers)
  8. Jingren Zhou (198 papers)
Citations (4)