- The paper introduces the Large Medical Model (LMM), a transformer model trained on patient event sequences using novel tokenization and Monte Carlo simulations to predict healthcare costs and chronic condition risks.
- In healthcare cost prediction, the LMM achieved a Normalized Mean Absolute Error (NMAE) of 78.3%, demonstrating a 14.1% improvement compared to leading commercial models.
- For chronic disease prediction, the LMM showed an average AUROC of 0.897, a 1.9% improvement over alternatives, and offers granular outputs capable of simulating clinical intervention impacts for various applications.
The paper presents the Large Medical Model (LMM), a transformer-based model that uses autoregressive next-token prediction on medical event sequences to forecast future healthcare costs and chronic condition risks. The work sits at the intersection of clinical informatics and large-scale healthcare analytics, combining a novel tokenization method with Monte Carlo simulation at inference time to produce probability distributions over patient care trajectories rather than single point estimates.
Technical Overview and Methodology
The LMM is trained on longitudinal claims data for over 140 million patients spanning 2016–2022. Key methodological innovations include:
- Custom Tokenization: Instead of using natural language representations, the model encodes structured medical events (e.g., ICD-10-CM diagnosis codes, CPT/HCPCS procedure codes, NDC drug identifiers, and cost and timing tokens) into a compact vocabulary. A patient history therefore takes far fewer tokens than it would under the natural-language encoding of a standard LLM, which preserves salient clinical signals and reduces computational overhead.
- Autoregressive Transformer Architecture: Leveraging an approach akin to GPT (Generative Pre-trained Transformer), the model is trained on next-token prediction, which naturally captures the temporal progression of clinical events in patient histories.
- Monte Carlo Simulation during Inference: By generating multiple (64 in experiments) alternative future sequences for each patient, the LMM derives a probability distribution over future events. This simulation-based approach facilitates the estimation of conditional probabilities, enabling insights into potential causal relationships between clinical events.
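The simulation step above can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the lookup table `NEXT_TOKEN` is a hypothetical stand-in for the trained transformer's next-token distribution, and every token name and probability in it is invented for the example. Only the sample count of 64 comes from the paper.

```python
import random

# Toy stand-in for the trained model: maps the last token to a next-token
# distribution. All tokens and probabilities are invented for illustration.
NEXT_TOKEN = {
    "DX:E11.9": {"RX:metformin": 0.6, "DX:I10": 0.3, "END": 0.1},
    "RX:metformin": {"COST:100": 0.7, "END": 0.3},
    "DX:I10": {"COST:500": 0.5, "END": 0.5},
    "COST:100": {"END": 1.0},
    "COST:500": {"END": 1.0},
}


def sample_future(history, rng, max_len=10):
    """Autoregressively sample future tokens until END or max_len is reached."""
    future = []
    last = history[-1]
    while len(future) < max_len:
        dist = NEXT_TOKEN[last]
        tokens = list(dist)
        nxt = rng.choices(tokens, weights=[dist[t] for t in tokens])[0]
        if nxt == "END":
            break
        future.append(nxt)
        last = nxt
    return future


def simulate(history, n_samples=64, seed=0):
    """Draw n_samples independent futures for one patient history."""
    rng = random.Random(seed)
    return [sample_future(history, rng) for _ in range(n_samples)]


def event_probability(futures, token):
    """Fraction of simulated futures in which the token appears."""
    return sum(token in f for f in futures) / len(futures)


def expected_cost(futures):
    """Mean total of COST:<amount> tokens across simulated futures."""
    totals = [
        sum(int(t.split(":")[1]) for t in f if t.startswith("COST:"))
        for f in futures
    ]
    return sum(totals) / len(totals)


futures = simulate(["DX:E11.9"])
p_htn = event_probability(futures, "DX:I10")
mean_cost = expected_cost(futures)
```

The design point is that a single set of sampled trajectories serves both tasks: the same 64 futures give an event probability (by counting) and a cost estimate (by summing cost tokens).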
Experimental Evaluation
Care Cost Prediction
The cost prediction component is evaluated on a cohort that adheres to the inclusion/exclusion criteria of a Society of Actuaries (SoA) paper. Two primary metrics are employed:
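- Normalized Mean Absolute Error (NMAE): The LMM achieves an NMAE of 78.3%, representing a 14.1% improvement over the best commercial models previously reported. Here, NMAE is the mean absolute error normalized by the mean actual cost, computed as
$$\mathrm{NMAE} = \frac{\sum_{i=1}^{n} |y_i - \hat{y}_i|}{\sum_{i=1}^{n} y_i}$$
where
- $y_i$ is the actual cost,
- $\hat{y}_i$ is the predicted cost, and
- $n$ is the number of observations.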
- R-Squared: The model also attains an R-squared of 25.3%, a 2.02 percentage point increase over prior best performers. The coefficient of determination is given by
$$R^2 = 1 - \frac{\sum_{i=1}^{n} (y_i - \hat{y}_i)^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$
with $\bar{y}$ representing the mean actual cost.
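As a worked illustration of the two metrics, the sketch below computes both on a four-patient toy example. It assumes NMAE is the sum of absolute errors divided by the sum of actual costs (equivalently, MAE over mean actual cost) and uses the standard definition of R-squared, one minus the residual sum of squares over the total sum of squares. The cost figures are made up and are not from the paper.

```python
def nmae(actual, predicted):
    """Sum of absolute errors divided by the sum of actual costs."""
    abs_err = sum(abs(y - p) for y, p in zip(actual, predicted))
    return abs_err / sum(actual)


def r_squared(actual, predicted):
    """One minus residual sum of squares over total sum of squares."""
    mean_y = sum(actual) / len(actual)
    ss_res = sum((y - p) ** 2 for y, p in zip(actual, predicted))
    ss_tot = sum((y - mean_y) ** 2 for y in actual)
    return 1 - ss_res / ss_tot


# Invented annual costs for four patients.
actual = [100.0, 200.0, 300.0, 400.0]
predicted = [150.0, 150.0, 350.0, 350.0]

print(nmae(actual, predicted))       # 200 / 1000 = 0.2
print(r_squared(actual, predicted))  # 1 - 10000 / 50000 = 0.8
```

Lower NMAE and higher R-squared are both better, which is why the reported gains move in opposite directions numerically.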
Compared against models such as those from Cotiviti (DxCG), Hopkins (ACG), Milliman (MARA), and the US Government (HHS-HCC), the LMM shows improved predictive accuracy and interpretability, particularly in stratifying high-cost patients.
Chronic Disease Prediction
The LMM is also benchmarked on the prediction of 19 chronic conditions by aligning conditions drawn from the Chronic Conditions Data Warehouse (CCW) with established clinical taxonomies. Performance is quantified via:
- AUROC (Area Under the Receiver Operating Characteristic Curve): The LMM achieves an average AUROC of 0.897, a 1.9% improvement relative to a transformer-based alternative (BEHRT). Notably, for conditions such as diabetes (0.95 vs. 0.81), atrial fibrillation and flutter, dyslipidemia, and hypertension, the LMM shows markedly higher discriminative power.
- AUPRC (Area Under the Precision-Recall Curve): AUPRC is not reported for every condition, but it complements AUROC where classes are imbalanced, as is typical when predicting rare diseases.
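To make the AUROC figures concrete, the following sketch computes AUROC directly from its pairwise definition: the probability that a randomly chosen positive patient receives a higher score than a randomly chosen negative one, with ties counted as half. The labels and scores are invented; in the LMM setting a score could be, for instance, the fraction of simulated futures containing the condition, but the paper's exact evaluation pipeline is not described here.

```python
def auroc(labels, scores):
    """Pairwise AUROC: P(score of a positive > score of a negative), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented example: 1 = patient developed the condition, 0 = did not.
labels = [1, 0, 1, 0, 1]
scores = [0.9, 0.2, 0.6, 0.7, 0.8]

print(auroc(labels, scores))  # 5 of 6 positive/negative pairs ranked correctly
```

The quadratic loop is chosen for readability; for large cohorts a rank-based computation or a library routine would be the practical choice.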
Additional Innovations and Use Cases
Beyond standard forecasting, the LMM offers capabilities that enhance its clinical utility:
- Simulation of Clinical Interventions: By appending a new event (e.g., a stroke) to a patient’s history, the model can simulate the downstream impact on future event probabilities. An example provided in the paper includes a differential prediction of Parkinson’s-related events between genders following a stroke, aligning with epidemiological data (e.g., a 1.5× higher likelihood in males).
- Granular, Actionable Outputs: Instead of providing a single point estimate, the LMM generates full event sequences. This allows for detailed insight into not only anticipated cost accumulation over time but also the specific procedures, diagnoses, and interventions that contribute to risk stratification.
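The intervention simulation described above amounts to comparing two Monte Carlo estimates: one from the original history and one from the history with the new event appended. A minimal sketch follows, assuming the model is available as a callable that returns one sampled future per call. The function `toy_sampler`, the placeholder tokens `DX:STROKE` and `DX:PARKINSONS`, and the probabilities 0.05 and 0.30 are all hypothetical and carry no clinical meaning.

```python
import random


def toy_sampler(history, rng):
    """Hypothetical stand-in for one model rollout; returns a list of future tokens."""
    # Arbitrary illustrative probabilities, not taken from the paper.
    p = 0.30 if "DX:STROKE" in history else 0.05
    return ["DX:PARKINSONS"] if rng.random() < p else []


def event_rate(sampler, history, target, n_samples, rng):
    """Fraction of sampled futures that contain the target token."""
    hits = sum(target in sampler(history, rng) for _ in range(n_samples))
    return hits / n_samples


def intervention_effect(sampler, history, new_event, target, n_samples=64, seed=0):
    """Estimate the target event rate without and with new_event appended."""
    rng = random.Random(seed)
    baseline = event_rate(sampler, history, target, n_samples, rng)
    # history + [new_event] builds a new list, leaving the original untouched.
    counterfactual = event_rate(sampler, history + [new_event], target, n_samples, rng)
    return baseline, counterfactual


base, cf = intervention_effect(
    toy_sampler, ["SEX:M", "DX:I10"], "DX:STROKE", "DX:PARKINSONS", n_samples=2000
)
print(base, cf)
```

Running the same comparison on histories that differ only in a demographic token would give a between-group contrast of the kind the paper reports. With only 64 samples per arm the estimates are noisy, so small differences should be read with caution.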
Potential applications span multiple domains in healthcare:
- Population Health Management: Stratification of patients based on predicted cost and intervention needs.
- Prior Authorization and Financial Forecasting: Improved risk adjustment and cost prediction to inform underwriting and stop-loss insurance decisions.
- In-Silico Research: Rapid simulation of the impact of hypothetical interventions, thereby guiding the design of future clinical studies without the immediate need for in-vivo testing.
Conclusion
The presented work advances the state of the art in healthcare analytics by integrating transformer-based approaches with a specialized clinical representation of data. The 14.1% improvement in cost prediction error (NMAE) and the 1.9% improvement in chronic disease prediction (AUROC) underscore its potential to reshape predictive modeling in healthcare. The model's simulation of patient trajectories and its ability to generate detailed, actionable event sequences position it as a significant tool for personalized medicine, risk management, and operational decision-making in clinical settings.