Agent Hospital: A Simulacrum of Hospital with Evolvable Medical Agents (2405.02957v1)
Abstract: In this paper, we introduce a hospital simulacrum called Agent Hospital that simulates the entire process of treating illness. All patients, nurses, and doctors are autonomous agents powered by large language models (LLMs). Our central goal is to enable a doctor agent to learn how to treat illness within the simulacrum. To do so, we propose a method called MedAgent-Zero. Because the simulacrum can simulate disease onset and progression based on knowledge bases and LLMs, doctor agents can keep accumulating experience from both successful and unsuccessful cases. Simulation experiments show that the treatment performance of doctor agents consistently improves on various tasks. More interestingly, the knowledge the doctor agents acquire in Agent Hospital is applicable to real-world medical benchmarks. After treating around ten thousand patients (a caseload that may take real-world doctors over two years), the evolved doctor agent achieves a state-of-the-art accuracy of 93.06% on a subset of the MedQA dataset that covers major respiratory diseases. This work paves the way for advancing the applications of LLM-powered agent techniques in medical scenarios.
Authors:
- Junkai Li
- Siyu Wang
- Meng Zhang
- Weitao Li
- Yunghwei Lai
- Xinhui Kang
- Weizhi Ma
- Yang Liu