- The paper presents a novel online, adaptive multi-task framework using vector-valued HMMs to capture dynamic inter-entity correlations for probabilistic load forecasting.
- It achieves significant accuracy improvements with lower MAPE and RMSE, outperforming state-of-the-art methods on various public datasets while scaling efficiently with an increasing number of entities.
- The method provides joint predictive distributions with uncertainty quantification, enabling real-time adaptability and enhanced reliability in decentralized power grid management.
Adaptive Multi-task Learning for Probabilistic Load Forecasting: A Technical Overview
Problem Statement and Motivation
Accurate and reliable load forecasting is critical for modern power systems, especially as grids become increasingly decentralized and integrated with renewable resources. Forecasting the loads of multiple entities (e.g., regions, neighborhoods, buildings) while quantifying uncertainty and capturing dynamic inter-entity correlations is a challenging high-dimensional problem. Existing multi-task learning (MTL) methods for load forecasting are limited to offline, often static, models that cannot adapt to time-varying patterns, and therefore fail to provide reliable online probabilistic forecasts. The paper "Adaptive Multi-task Learning for Probabilistic Load Forecasting" (2512.20232) addresses these shortcomings by proposing an online, adaptive, and probabilistic MTL framework based on vector-valued hidden Markov models (HMMs).
Proposed Method: Multi-Task Adaptive Probabilistic Load Forecasting
The core contribution is the Multi-APLF (Multi-task Adaptive Probabilistic Load Forecasting) framework, built on vector-valued HMMs tailored for joint modeling and prediction of the consumption processes of multiple entities. The method provides the following main advancements:
- Online Multi-task Learning: The technique enables recursive, sequential adaptation to new data via exponentially weighted log-likelihood maximization. Unlike prior offline MTL approaches, it incorporates recent information to track nonstationarity and dynamic correlations.
- Probabilistic Forecasts with Uncertainty Quantification: The method produces joint predictive distributions for all target entities, capturing both marginal variances and inter-entity covariances at each horizon.
- Scalable and Efficient Updates: The recursive parameter updates for means and covariance matrices are computationally efficient: their cost grows at most quadratically with the number of entities and is independent of the training set size. This enables application to large-scale, real-time load forecasting.
- Interpretability: The joint covariance structure explicitly models entity-wise uncertainty and correlations, which can be controlled via covariance sparsification to avoid spurious dependencies in low data regimes.
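The recursive, exponentially weighted updates described in the list above can be illustrated with a generic estimator of a running mean vector and covariance matrix. This is a minimal sketch under stated assumptions, not the paper's exact update rule: the class name `ExpWeightedStats`, the forgetting factor value, and the use of raw load vectors as inputs are all hypothetical choices made for illustration.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]


class ExpWeightedStats:
    """Exponentially weighted running mean and covariance of a load vector.

    Hypothetical illustration: observation i gets weight forgetting**(t - i),
    so recent samples dominate and old ones fade out.
    """

    def __init__(self, dim: int, forgetting: float = 0.98) -> None:
        if not 0.0 < forgetting <= 1.0:
            raise ValueError("forgetting must lie in (0, 1]")
        self.lam = forgetting
        self.gamma = 0.0  # sum of the decayed weights (effective sample size)
        self.mean: Vector = [0.0] * dim
        self.cov: Matrix = [[0.0] * dim for _ in range(dim)]

    def update(self, x: Vector) -> None:
        """Fold in one observation; cost is O(dim^2) regardless of history length."""
        self.gamma = 1.0 + self.lam * self.gamma
        step = 1.0 / self.gamma
        err_old = [xi - mi for xi, mi in zip(x, self.mean)]  # x minus previous mean
        self.mean = [mi + step * ei for mi, ei in zip(self.mean, err_old)]
        err_new = [xi - mi for xi, mi in zip(x, self.mean)]  # x minus updated mean
        for i, ei in enumerate(err_old):
            for j, ej in enumerate(err_new):
                self.cov[i][j] += step * (ei * ej - self.cov[i][j])


# Usage: two entities whose loads move together.
stats = ExpWeightedStats(dim=2, forgetting=1.0)  # forgetting=1.0 means no decay
for load in ([1.0, 2.0], [3.0, 6.0], [5.0, 10.0]):
    stats.update(load)
```

With `forgetting=1.0` the estimator reduces to the ordinary (biased) sample mean and covariance; values below 1 shorten the effective memory, which is what lets an online model track nonstationary correlations. The off-diagonal entries of `cov` play the role of the inter-entity covariances mentioned above.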
The HMM framework is parameterized such that, for each calendar type (hour-of-day, weekday, holiday), it maintains and adapts a set of transition and observation parameters. This allows utilizing all available exogenous information (e.g., weather, calendar) and integrating complex, time-dependent entity relationships.
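One way to picture the per-calendar-type parameterization is a lookup table keyed by calendar type, with an independent parameter record behind each key. The sketch below is an assumption for illustration only: the key layout (hour of day plus a weekday/weekend/holiday label), the function names `calendar_type` and `params_for`, and the placeholder record contents are hypothetical and not taken from the paper.

```python
from datetime import date, datetime
from typing import AbstractSet, Dict, Tuple

CalendarType = Tuple[int, str]  # (hour of day, kind of day)


def calendar_type(ts: datetime, holidays: AbstractSet[date] = frozenset()) -> CalendarType:
    """Map a timestamp to a calendar type (hypothetical grouping)."""
    if ts.date() in holidays:
        day_kind = "holiday"
    elif ts.weekday() >= 5:  # Monday is 0, so 5 and 6 are Saturday and Sunday
        day_kind = "weekend"
    else:
        day_kind = "weekday"
    return ts.hour, day_kind


# One parameter record per calendar type, created lazily on first use.
params: Dict[CalendarType, dict] = {}


def params_for(ts: datetime, holidays: AbstractSet[date] = frozenset()) -> dict:
    """Return the record that would be updated or used for prediction at time ts."""
    return params.setdefault(calendar_type(ts, holidays), {"n_updates": 0})


# Usage: New Year's Day 2024 at 13:00 uses the (13, "holiday") record.
record = params_for(datetime(2024, 1, 1, 13), holidays={date(2024, 1, 1)})
record["n_updates"] += 1
```

Keeping a separate record per calendar type means each update touches only the parameters relevant to the current hour and day kind, so patterns specific to, say, weekday mornings do not contaminate holiday evenings.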
Theoretical Guarantees and Implementation
The paper provides formal statements and proofs of the recursive learning and prediction steps. For each calendar type, the parameters are updated via recursive least squares with forgetting, providing adaptation to regime changes and sample-efficient updates. The theoretical results also show how future probabilistic forecasts for each entity are recursively derived, producing full joint Gaussian predictive distributions conditioned on latest observations and model parameters.
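To make "recursive least squares with forgetting" concrete, here is a textbook version of the algorithm for a scalar target. It is a hedged sketch rather than the paper's update equations: the class name `RecursiveLeastSquares`, the feature layout, the forgetting factor, and the initialization scale are assumptions chosen for illustration, and the paper's vector-valued variant additionally maintains covariance matrices.

```python
from typing import List, Sequence


class RecursiveLeastSquares:
    """Textbook RLS with a forgetting factor for a scalar target (illustrative)."""

    def __init__(self, dim: int, forgetting: float = 0.99, init_scale: float = 1e6) -> None:
        self.lam = forgetting
        self.theta: List[float] = [0.0] * dim  # regression coefficients
        # A large initial P encodes a weak prior on theta.
        self.P: List[List[float]] = [
            [init_scale if i == j else 0.0 for j in range(dim)] for i in range(dim)
        ]

    def predict(self, u: Sequence[float]) -> float:
        return sum(t * ui for t, ui in zip(self.theta, u))

    def update(self, u: Sequence[float], y: float) -> float:
        """Incorporate one (features, target) pair; returns the a priori error."""
        n = len(self.theta)
        Pu = [sum(pij * uj for pij, uj in zip(row, u)) for row in self.P]
        denom = self.lam + sum(ui * pui for ui, pui in zip(u, Pu))
        gain = [pui / denom for pui in Pu]
        err = y - self.predict(u)
        self.theta = [t + g * err for t, g in zip(self.theta, gain)]
        # P stays symmetric, so u^T P equals (P u)^T.
        self.P = [
            [(self.P[i][j] - gain[i] * Pu[j]) / self.lam for j in range(n)]
            for i in range(n)
        ]
        return err


# Usage: recover y = 1 + 2 x from a stream of noiseless samples.
rls = RecursiveLeastSquares(dim=2, forgetting=0.99)
for x in range(20):
    rls.update([1.0, float(x)], 1.0 + 2.0 * x)
```

Each update costs O(dim^2) and needs no access to past samples, which is the property that makes the cost independent of the training set size. A forgetting factor below 1 discounts old samples geometrically, so the coefficients can follow a regime change instead of averaging over it.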
The implementation is detailed with pseudocode for both learning and prediction, and a publicly available Python codebase supports deployment in real-world scenarios. Computational and space complexity analyses indicate better runtime scalability than competing MTL approaches (e.g., MTGP, VAR, MLR), especially as the number of entities and the prediction horizon grow.
Experimental Results
The approach is evaluated on five public datasets ranging from regional to building-level load prediction, including GEFCom, ISO New England, PJM, Australian demand, and New South Wales building collections.
Across all datasets and entities, Multi-APLF achieves lower RMSE and MAPE than the state-of-the-art, outperforming both single-task (APLF, N-HiTS) and multi-task (MTGP, VAR, MLR) baselines. For example, on the GEFCom dataset, Multi-APLF reduces mean MAPE to 4.61%, compared with 6.48% for N-HiTS and 6.38% for MTGP, and reduces mean RMSE to 0.11 GW, compared with 0.15 GW for both N-HiTS and MTGP.
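For reference, the two point-forecast metrics quoted above follow their standard definitions, sketched below. The function names and the sample numbers are illustrative assumptions; they are not values from the paper's experiments.

```python
import math
from typing import Sequence


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error, in percent. Assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error, in the same units as the load."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


# Usage with made-up loads: both forecasts are off by 10% of the actual value.
actual_load = [100.0, 200.0]
forecast = [110.0, 180.0]
error_pct = mape(actual_load, forecast)
error_abs = rmse(actual_load, forecast)
```

MAPE is scale-free, which makes it comparable across entities of very different size, whereas RMSE is expressed in load units (GW in the GEFCom figures above) and penalizes large errors more heavily.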
Probabilistic accuracy is also markedly improved: Multi-APLF delivers lower CRPS, pinball loss, and improved calibration error compared to MTGP and single-task probabilistic baselines. This demonstrates both better point predictions and a more realistic assessment of uncertainty, which is essential for risk-aware operational decision-making.
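The probabilistic metrics named above can be sketched in a few lines. The pinball loss scores a single predicted quantile, and for a Gaussian predictive distribution the CRPS has a well-known closed form. Whether the paper evaluates CRPS through this closed form or by other means is not stated here, so treat the functions and their names as illustrative assumptions.

```python
import math


def pinball_loss(y: float, quantile_pred: float, tau: float) -> float:
    """Pinball (quantile) loss of a predicted tau-quantile against outcome y."""
    diff = y - quantile_pred
    return max(tau * diff, (tau - 1.0) * diff)


def crps_gaussian(y: float, mu: float, sigma: float) -> float:
    """Closed-form CRPS of the forecast N(mu, sigma^2) against outcome y."""
    z = (y - mu) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return sigma * (z * (2.0 * cdf - 1.0) + 2.0 * pdf - 1.0 / math.sqrt(math.pi))


# Usage: under-predicting a 0.9-quantile costs more than over-predicting it
# by the same amount.
loss_under = pinball_loss(y=10.0, quantile_pred=8.0, tau=0.9)
loss_over = pinball_loss(y=8.0, quantile_pred=10.0, tau=0.9)
```

Both scores are lower-is-better and reward forecasts that are sharp as well as calibrated: an overly wide Gaussian is penalized through the `sigma` factor even when its mean is accurate.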
Multi-APLF is robust to delayed data and to increasing numbers of entities. Performance degrades gracefully in the presence of communication delays, and the method outperforms offline competitors even with substantial lag. As the number of entities grows (e.g., to 40 or more), prediction error decreases due to more effective information sharing and cross-entity generalization, which highlights the method's suitability for high-dimensional settings.
Theoretical and Practical Implications
The work marks a significant advancement in multi-task load forecasting by bridging the gap between probabilistic modeling, online adaptation, and computational tractability. The combination of vector-valued HMMs, recursive parameter adaptation, and joint likelihood-based uncertainty quantification provides a unified approach that is both theoretically principled and practically scalable.
The methodology demonstrates that multi-entity joint modeling with recursive adaptation is not only necessary for scalable grid management, but also feasible with substantial gains in practical accuracy and reliability. The explicit modeling of inter-entity dependencies further enables novel applications for distributed control, scenario analysis, and risk management in smart grids.
Looking forward, future theoretical developments may include extension to hierarchical HMMs, non-linear observation models, and the integration of richer forms of exogenous information (e.g., market prices, distributed generation). On the practical side, the framework is well-suited to being embedded in real-time operational platforms, and could be expanded to handle extreme-event forecasting, grid flexibility management, and integration with reinforcement learning for autonomous grid control.
Conclusion
This paper presents a computationally efficient, online, and probabilistically robust multi-task learning paradigm for load forecasting in complex power systems (2512.20232). By leveraging vector-valued HMMs with recursive updates, the method achieves superior predictive and probabilistic accuracy, scales to large numbers of entities, and adapts in real-time to changing grid conditions. These advances provide both a practical tool for grid operators and a foundation for further research in adaptive, data-driven power system modeling.