
Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents (2405.02957v3)

Published 5 May 2024 in cs.AI

Abstract: The recent rapid development of LLMs has sparked a new wave of technological revolution in medical AI. While LLMs are designed to understand and generate text like a human, autonomous agents that utilize LLMs as their "brain" have exhibited capabilities beyond text processing such as planning, reflection, and using tools by enabling their "bodies" to interact with the environment. We introduce a simulacrum of hospital called Agent Hospital that simulates the entire process of treating illness, in which all patients, nurses, and doctors are LLM-powered autonomous agents. Within the simulacrum, doctor agents are able to evolve by treating a large number of patient agents without the need to label training data manually. After treating tens of thousands of patient agents in the simulacrum (human doctors may take several years in the real world), the evolved doctor agents outperform state-of-the-art medical agent methods on the MedQA benchmark comprising US Medical Licensing Examination (USMLE) test questions. Our methods of simulacrum construction and agent evolution have the potential in benefiting a broad range of applications beyond medical AI.


Summary

  • The paper introduces the MedAgent-Zero method, enabling LLM-powered doctor agents to iteratively enhance diagnostic and treatment accuracy.
  • It simulates the full hospital process, from disease onset through triage, consultation, examination, diagnosis, and treatment to follow-up, using dynamic interactions between patient and professional agents.
  • Experimental results show rising diagnostic accuracy within the simulation and state-of-the-art accuracy of 93.06% on a MedQA subset, indicating that knowledge gained in simulation transfers to a real-world benchmark.

An Overview of "Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents"

"Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents" introduces an innovative approach to leveraging LLM agents within a simulated hospital environment, dubbed Agent Hospital. This simulation aims to meticulously model the entire medical process, encompassing disease onset, triage, registration, consultation, medical examination, diagnosis, treatment, recovery, and follow-up. A central objective of this paper is to validate the continuous self-improvement capabilities of medical agents, specifically doctor agents, through iterative interactions within this controlled environment, thereby contributing to advancements in LLM-powered applications for healthcare.

Simulation and Medical Agents

Agent Hospital is designed with distinct areas mimicking various hospital departments, such as triage stations, consultation rooms, and examination facilities. Two primary agent types inhabit this environment: medical professionals (doctors and nurses) and residents (potential patients). Medical professional agents are tasked with diagnosis and treatment planning, while residents simulate patients who interact with the hospital upon illness onset.
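
To make the two roles concrete, the sketch below models them as plain Python objects. The class names, fields, and the `llm` callable are illustrative assumptions rather than the paper's actual implementation.

```python
from dataclasses import dataclass, field

@dataclass
class ResidentAgent:
    """A resident who may fall ill and then visit the hospital (illustrative)."""
    name: str
    disease: str = ""                              # ground-truth condition assigned by the simulator
    symptoms: list = field(default_factory=list)   # symptoms the agent reports during consultation
    history: list = field(default_factory=list)    # (stage, outcome) pairs collected along the way

@dataclass
class DoctorAgent:
    """An LLM-backed medical professional (illustrative)."""
    name: str
    department: str

    def diagnose(self, case, llm):
        # In the full system the prompt would also include retrieved records and
        # experience principles (see the MedAgent-Zero sketch below).
        prompt = f"Patient symptoms: {case['symptoms']}. Give the most likely diagnosis."
        return llm(prompt)
```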

The foundational structure of Agent Hospital is augmented by a planning mechanism that ensures logical sequences of actions for both patient and medical professional agents. Patient agents experience and report symptoms, undergo triage, and proceed through the medical examination and treatment cycle. Medical professional agents, in turn, refine their diagnostic and therapeutic expertise through practical engagement with patients and through proactive learning from accumulated medical records and an experience base.
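
A minimal way to picture this planning mechanism is a fixed stage pipeline that each patient agent is walked through, with a responsible agent acting at every stage. The stage names follow the process described above; the control flow and the handler interface are assumptions for illustration.

```python
# Stages follow the care process described in the paper; the dispatch logic is assumed.
HOSPITAL_STAGES = [
    "disease_onset", "triage", "registration", "consultation",
    "medical_examination", "diagnosis", "treatment", "recovery", "follow_up",
]

def run_patient_episode(patient, handlers):
    """Advance one patient agent through the full care process.

    `handlers` maps each stage name to the agent responsible for it,
    e.g. a nurse agent for triage or a doctor agent for diagnosis.
    """
    for stage in HOSPITAL_STAGES:
        outcome = handlers[stage].act(patient)      # each handler may call an LLM
        patient.history.append((stage, outcome))    # kept so doctor agents can learn from the case later
    return patient.history
```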

Methodology: MedAgent-Zero

To drive the evolution of doctor agents, the paper introduces a method called MedAgent-Zero. The method requires neither parameter updates nor manually labeled data and rests on two components: the Medical Record Library and the Experience Base. The Medical Record Library accumulates correct diagnostic and treatment records from previous interactions, which serve as references for future decision-making. The Experience Base compiles principles distilled from diagnostic errors; these principles are validated and then used to improve diagnostic accuracy. By interacting with these two components, medical professional agents improve continuously, much as human doctors learn from practical experience and from reflecting on mistakes.
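
The sketch below shows one way these two components might be organized; the storage layout, the symptom-overlap retrieval heuristic, and the reflection prompt are assumptions, not the paper's exact design.

```python
class MedicalRecordLibrary:
    """Accumulates cases the doctor agent handled correctly."""

    def __init__(self):
        self.records = []   # each record: {"symptoms": [...], "diagnosis": ..., "treatment": ...}

    def add(self, symptoms, diagnosis, treatment):
        self.records.append({"symptoms": symptoms, "diagnosis": diagnosis, "treatment": treatment})

    def retrieve(self, symptoms, k=3):
        # Placeholder similarity: overlap between reported symptoms.
        scored = sorted(self.records,
                        key=lambda r: len(set(r["symptoms"]) & set(symptoms)),
                        reverse=True)
        return scored[:k]


class ExperienceBase:
    """Compiles principles distilled from diagnostic errors."""

    def __init__(self):
        self.principles = []

    def reflect(self, llm, case, wrong_answer, correct_answer):
        prompt = (
            "You misdiagnosed the following case.\n"
            f"Case: {case}\nYour answer: {wrong_answer}\nCorrect answer: {correct_answer}\n"
            "State one general principle that would prevent this mistake."
        )
        principle = llm(prompt)
        self.principles.append(principle)   # validated principles are reused in later prompts
        return principle
```

In such a setup, each new consultation prompt would be augmented with a few retrieved records and the accumulated principles, so that correct cases and past mistakes both shape the next decision.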

Experimental Results

The evaluation of MedAgent-Zero encompasses simulation experiments with a dataset of generated patient records covering eight representative respiratory diseases. Results reveal marked improvements in accuracy over time, with doctor agents reaching 88.0%, 95.6%, and 77.6% on examination, diagnosis, and treatment tasks, respectively. This iterative self-evolution shows up as steadily rising accuracy on simulated patients, and it is more efficient than traditional medical training because it sidesteps the time and resource constraints of human learning.

Additionally, the evolved doctor agents were evaluated on a real-world medical examination dataset, a subset of the MedQA dataset covering major respiratory diseases. The results corroborate the transferability of knowledge gained in the simulated environment to real-world medical benchmarks, with the evolved agents achieving state-of-the-art accuracy of 93.06%, surpassing human expert performance in some cases.
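
One plausible way to transfer the accumulated knowledge to a written exam is to inject the stored principles into the prompt for each question; the prompt format below is an assumption for illustration, not the paper's reported procedure.

```python
def answer_medqa(llm, question, options, experience_base):
    """Answer a multiple-choice MedQA question with experience-augmented prompting (assumed format)."""
    principles = "\n".join(experience_base.principles)
    prompt = (
        f"Diagnostic principles learned from past mistakes:\n{principles}\n\n"
        f"Question: {question}\nOptions: {options}\n"
        "Reply with the letter of the best option."
    )
    return llm(prompt)
```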

Implications and Future Directions

The implications of this paper are multifaceted. Practically, the results underscore the potential of LLM-powered agents to enhance medical training, diagnosis, and treatment in real-world settings without human intervention. Theoretically, this paper contributes to our understanding of how simulated environments can foster agent learning and adaptation, presenting a paradigm where agents continually evolve through structured interactions and feedback loops.

Looking forward, several avenues for future research emerge. Expanding the disease repertoire and medical scenarios within the simulation would further validate the scalability and versatility of MedAgent-Zero. Additionally, improving the underlying LLM models and optimizing the interaction mechanisms may enhance the efficiency and efficacy of medical agent training. Lastly, exploring applications beyond the medical field could generalize the principles of agent evolution through simulated environments, potentially benefiting other domains reliant on expert decision-making.

In conclusion, "Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents" presents a comprehensive framework for the continuous evolution of medical agents within a simulated environment. By seamlessly integrating practical patient interactions and reflective learning, the paper paves the way for significant advancements in the application of LLM-powered agents in healthcare and beyond.
