Polaris: A Safety-focused LLM Constellation Architecture for Healthcare

Published 20 Mar 2024 in cs.AI and cs.CL | (2403.13313v1)

Abstract: We develop Polaris, the first safety-focused LLM constellation for real-time patient-AI healthcare conversations. Unlike prior LLM works in healthcare focusing on tasks like question answering, our work specifically focuses on long multi-turn voice conversations. Our one-trillion parameter constellation system is composed of several multibillion parameter LLMs as co-operative agents: a stateful primary agent that focuses on driving an engaging conversation and several specialist support agents focused on healthcare tasks performed by nurses to increase safety and reduce hallucinations. We develop a sophisticated training protocol for iterative co-training of the agents that optimize for diverse objectives. We train our models on proprietary data, clinical care plans, healthcare regulatory documents, medical manuals, and other medical reasoning documents. We align our models to speak like medical professionals, using organic healthcare conversations and simulated ones between patient actors and experienced nurses. This allows our system to express unique capabilities such as rapport building, trust building, empathy and bedside manner. Finally, we present the first comprehensive clinician evaluation of an LLM system for healthcare. We recruited over 1100 U.S. licensed nurses and over 130 U.S. licensed physicians to perform end-to-end conversational evaluations of our system by posing as patients and rating the system on several measures. We demonstrate Polaris performs on par with human nurses on aggregate across dimensions such as medical safety, clinical readiness, conversational quality, and bedside manner. Additionally, we conduct a challenging task-based evaluation of the individual specialist support agents, where we demonstrate our LLM agents significantly outperform a much larger general-purpose LLM (GPT-4) as well as a model from their own medium-size class (LLaMA-2 70B).

Authors (26)

Summary

The paper introduces Polaris, a safety-focused LLM constellation architecture that integrates a primary conversational agent with specialist support agents for enhanced healthcare communications.
It employs co-training with diverse medical resources and simulated clinical scenarios to ensure accurate and patient-friendly multi-turn dialogues.
In an evaluation by over 1100 U.S. licensed nurses and over 130 U.S. licensed physicians, Polaris performs on par with human nurses in aggregate, and its specialist agents outperform GPT-4 and LLaMA-2 70B on task-based evaluations.

Polaris: Advancing Healthcare Conversations with a Safety-focused LLM Constellation

Overview of Polaris

Polaris applies LLMs to real-time patient-AI healthcare conversations through a constellation architecture built for that purpose. Unlike prior LLM work in healthcare, which focused on tasks such as question answering, Polaris targets long multi-turn voice conversations and aims to combine engaging, patient-friendly dialogue with medically accurate, safety-conscious interactions.

The system is a one-trillion-parameter constellation of several multibillion-parameter LLMs acting as cooperative agents: a stateful primary agent and multiple specialist support agents. The primary agent drives an engaging conversation, while the specialists handle healthcare tasks normally performed by nurses, such as medication adherence and lab result interpretation, to increase safety and reduce hallucinations.
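
The split between a primary agent and specialist support agents can be pictured with a minimal sketch. Everything below is an illustrative assumption rather than the authors' implementation: the class names `SupportAgent` and `Constellation`, the keyword triggers, and the canned replies are hypothetical stand-ins for what the paper describes as LLM-based agents.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SupportAgent:
    """Hypothetical stand-in for a specialist LLM agent."""
    name: str
    triggers: List[str]              # keywords that activate the agent (assumption)
    advise: Callable[[str], str]     # returns guidance for the primary agent


@dataclass
class Constellation:
    """Stateful primary agent plus its specialist support agents."""
    specialists: List[SupportAgent]
    history: List[str] = field(default_factory=list)   # conversation state

    def consult(self, utterance: str) -> Dict[str, str]:
        """Collect guidance from every specialist whose trigger matches."""
        text = utterance.lower()
        return {
            agent.name: agent.advise(utterance)
            for agent in self.specialists
            if any(trigger in text for trigger in agent.triggers)
        }

    def respond(self, utterance: str) -> str:
        """Primary agent drives the turn, folding in specialist guidance."""
        self.history.append(utterance)
        guidance = self.consult(utterance)
        if not guidance:
            return "Thanks for sharing. How have you been feeling otherwise?"
        notes = "; ".join(f"{name}: {note}" for name, note in guidance.items())
        return f"Let me go over that with you. ({notes})"


constellation = Constellation(
    specialists=[
        SupportAgent("medication", ["dose", "pill", "mg"],
                     lambda u: "verify the stated dose against the prescription"),
        SupportAgent("labs", ["blood pressure", "glucose", "lab"],
                     lambda u: "compare the value with the patient's recent trend"),
    ]
)
```

Calling `constellation.respond("I took my pill at a double dose today")` would activate only the hypothetical medication agent. In a real system the keyword match would be replaced by a learned activation mechanism, which this sketch does not attempt to model.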

Training follows an iterative co-training protocol that optimizes the agents for diverse objectives, drawing on proprietary data, clinical care plans, healthcare regulatory documents, medical manuals, and other medical reasoning documents. The models are then aligned to speak like medical professionals using organic healthcare conversations and simulated ones between patient actors and experienced nurses, which supports capabilities such as rapport building, trust building, empathy, and bedside manner.

Specialist Support Agents

Critical to Polaris's success are its specialist support agents, each designed for specific healthcare functions:

Privacy & Compliance Specialist: Ensures identity verification before discussing any Personal Health Information (PHI), addressing privacy and compliance concerns.
Checklist Specialist: Manages and navigates through complex care protocols to ensure all necessary topics are covered during a conversation.
Medication Specialist: Offers detailed support on medication adherence, contra-indications, and dosage verification, crucial for patient safety.
Labs & Vitals Specialist: Interprets lab results and vital signs within the context of the patient's health record, providing insight into changes and trends.
Nutrition Specialist: Gives tailored dietary advice based on the patient's health status and nutritional needs, particularly relevant for conditions like CHF and CKD.
Policy Specialist: Answers queries related to hospital and payor policies, utilizing a Retrieval-Augmented Generation (RAG) approach for up-to-date information.

This division of labor allows Polaris to allocate computational resources efficiently, reduce the primary agent's load, and ensure specialist tasks are performed with greater accuracy and safety.
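
The retrieval-augmented approach attributed to the Policy Specialist above can be illustrated with a toy sketch. It is not the paper's method: the policy snippets in `POLICY_DOCS`, the word-overlap scoring, and the function names are all assumptions chosen to keep the example self-contained, and a production system would use a proper retriever and pass the retrieved passage to an LLM.

```python
import re
from collections import Counter
from typing import List, Optional

# Hypothetical policy snippets; a real deployment would index hospital and payor documents.
POLICY_DOCS = [
    "Visiting hours are from 9 am to 8 pm on all inpatient floors.",
    "Prescription refills require approval from the prescribing physician.",
    "Billing questions are handled by the patient financial services office.",
]


def tokenize(text: str) -> Counter:
    """Lowercase bag-of-words representation."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, docs: List[str]) -> Optional[str]:
    """Return the document sharing the most words with the query, if any."""
    q = tokenize(query)
    scored = [(sum((q & tokenize(d)).values()), d) for d in docs]
    best_score, best_doc = max(scored, key=lambda pair: pair[0])
    return best_doc if best_score > 0 else None


def answer_policy_question(query: str, docs: List[str]) -> str:
    """Ground the answer in a retrieved passage instead of model memory."""
    passage = retrieve(query, docs)
    if passage is None:
        return "I don't have a policy on file for that; let me connect you with staff."
    # A real system would give `passage` to an LLM as context; here it is quoted directly.
    return f"According to our policy: {passage}"
```

The design point is that the answer is tied to a retrieved passage, so updating the indexed documents updates the answers without retraining, and a query with no match falls back to a hand-off rather than a guess.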

Evaluation

Polaris underwent a comprehensive clinician evaluation that measured its performance against human nurses; its specialist agents were separately compared with general-purpose LLMs such as GPT-4. Over 1100 U.S. licensed nurses and more than 130 U.S. licensed physicians posed as patients in end-to-end conversational assessments and rated the system on several measures. In aggregate, Polaris performed on par with human nurses across dimensions such as medical safety, clinical readiness, conversational quality, and bedside manner.

In a task-based evaluation of the individual specialist support agents, Polaris's agents significantly outperformed both the much larger general-purpose GPT-4 and LLaMA-2 70B, a model from their own medium-size class, on tasks such as medication adherence, lab result interpretation, and dietary recommendations. This supports the case for a focused architecture and training approach in healthcare conversations.

Future Directions

The Polaris team plans to add multi-call relationships for personalized care, improve how support agents are activated and how they communicate, and integrate multimodal modeling to enrich conversation dynamics. The team also plans to explore asynchronous operation modes for support agents, with the goal of reducing response latency while maintaining conversational fluency and safety.
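
One way to picture asynchronous support agents is a per-turn latency budget: specialists run concurrently, and the primary agent proceeds with whatever guidance has arrived in time. The sketch below is an assumption about how such a mode could look, not a description of the planned design; the agent names, simulated delays, and the `budget_s` parameter are hypothetical.

```python
import asyncio
from typing import Dict, Tuple


async def support_agent(name: str, delay_s: float) -> Tuple[str, str]:
    """Hypothetical specialist whose latency is simulated with a sleep."""
    await asyncio.sleep(delay_s)
    return name, f"{name} guidance ready"


async def gather_guidance(budget_s: float) -> Dict[str, str]:
    """Run specialists concurrently and keep only results that arrive within the budget."""
    tasks = [
        asyncio.create_task(support_agent("medication", 0.01)),
        asyncio.create_task(support_agent("labs", 0.02)),
        asyncio.create_task(support_agent("policy", 5.0)),   # too slow for this turn
    ]
    done, pending = await asyncio.wait(tasks, timeout=budget_s)
    for task in pending:
        task.cancel()            # late agents do not block the reply
    # Let cancelled tasks finish cleanly before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    return dict(task.result() for task in done)


guidance = asyncio.run(gather_guidance(budget_s=0.5))
```

Here the simulated policy agent misses the budget and is cancelled, so the turn continues with the medication and labs guidance only. Whether a late safety-critical agent should instead delay the reply is a design question this sketch leaves open.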

Conclusion

Polaris introduces a constellation architecture for LLMs in healthcare that prioritizes safety, accuracy, and the patient experience in real-time conversations. By assigning distinct responsibilities to specialized agents, it targets both clinical improvements and operational efficiency. With continued development and rigorous testing, the system is intended to address critical challenges in healthcare delivery.
