Personalized Forecasting of Glycemic Control in Type 1 and 2 Diabetes Using Foundational AI and Machine Learning Models

Published 2 Jan 2026 in q-bio.OT | (2601.00613v1)

Abstract: Background: Accurate week-ahead forecasts of continuous glucose monitoring (CGM) derived metrics could enable proactive diabetes management, but the relative performance of modern tabular learning approaches is incompletely defined. Methods: We trained and internally validated four regression models (CatBoost, XGBoost, AutoGluon, tabPFN) to predict six week-ahead CGM metrics (TIR, TITR, TAR, TBR, CV, and MAGE) and related quantiles using 4,622 case-weeks from two cohorts (T1DM n=3,389; T2DM n=1,233). Performance was assessed with mean absolute error (MAE) and mean absolute relative difference (MARD); quantile classification was summarized via confusion-matrix heatmaps. Results: Across T1DM and T2DM, all models produced broadly comparable performance for most targets. For T1DM, MARD for TIR, TITR, TAR, and MAGE ranged from 8.5% to 16.5%, while TBR showed large MARD (mean ~48%) despite low MAE. AutoGluon and tabPFN showed lower MAE than XGBoost for several targets (e.g., TITR: p<0.01; TAR/TBR: p<0.05 to p<0.01). For T2DM, MARD ranged from 7.8% to 23.9% and the relative error for TBR was ~78%; tabPFN outperformed the other models for TIR (p<0.01), and AutoGluon/tabPFN outperformed CatBoost/XGBoost on TAR (p<0.05). Inference time per 1,000 cases varied markedly (tabPFN 699 s; AutoGluon 2.7 s; CatBoost 0.04 s; XGBoost 0.04 s). Conclusions: Week-ahead CGM metrics are predictable with reasonable accuracy using modern tabular models, but low-prevalence hypoglycemia remains difficult to predict in relative terms. Advanced AutoML and foundation models yield modest accuracy gains at substantially higher computational cost.
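The contrast between low MAE and high MARD for TBR follows from how the two metrics are defined. The sketch below is a minimal illustration, not the authors' evaluation code: it assumes MARD is the mean of |predicted - observed| / |observed| expressed as a percentage, and that weeks with an observed value of zero are excluded from MARD (the abstract does not say how such weeks were handled). The function names and the example values are hypothetical and chosen only to show the arithmetic.

```python
import numpy as np


def mae(y_true, y_pred):
    """Mean absolute error, in the same units as the target."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_pred - y_true)))


def mard(y_true, y_pred, eps=1e-9):
    """Mean absolute relative difference, in percent.

    Assumption: observations with |y_true| <= eps are dropped, because
    the relative error is undefined when the observed value is zero.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = np.abs(y_true) > eps
    rel = np.abs(y_pred[mask] - y_true[mask]) / np.abs(y_true[mask])
    return float(100.0 * np.mean(rel))


# Hypothetical weekly values (percent of time), for illustration only.
tir_true = [70, 65, 80, 75]   # time in range: large observed values
tir_pred = [66, 69, 76, 79]   # every prediction is off by 4 points
tbr_true = [2, 1, 4, 2]       # time below range: small observed values
tbr_pred = [3, 2, 3, 1]       # every prediction is off by 1 point

print("TIR  MAE:", mae(tir_true, tir_pred), " MARD:", round(mard(tir_true, tir_pred), 2))
print("TBR  MAE:", mae(tbr_true, tbr_pred), " MARD:", round(mard(tbr_true, tbr_pred), 2))
```

In this toy example the TBR predictions have a smaller absolute error than the TIR predictions, yet their relative error is roughly ten times larger, because each error is divided by a much smaller observed value. This is the same pattern the abstract reports for hypoglycemia-related targets.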