Electricity Price Forecasting (EPF)

Updated 15 September 2025

Electricity Price Forecasting (EPF) is the quantitative prediction of future electricity prices using models that account for seasonality, autocorrelation, and abrupt market changes.
It integrates statistical, econometric, and machine learning techniques, including hybrid and supply-demand curve based models, to address market structure and exogenous influences.
Recent advances emphasize probabilistic forecasting, deep learning ensembles, and adaptive retraining to enhance accuracy and capture complex dependencies in electricity markets.

Electricity price forecasting (EPF) refers to the quantitative prediction of future electricity prices across multiple market horizons, employing statistical, econometric, and machine learning methods. EPF is central to electricity market operation, trading, risk management, and policy design, with applications spanning from real-time to months- or years-ahead forecasts. The domain is characterized by the unique statistical and economic properties of electricity prices—strong seasonality, pronounced autocorrelation, regime shifts, price spikes, and strong dependencies on exogenous factors such as load, renewable generation, and fuel costs. Recent research focuses on higher accuracy, probabilistic forecasting, interpretability, and robust adaptation to market structural changes.

1. Foundational Modeling Approaches

Two primary modeling paradigms have historically dominated EPF:

Statistical/Econometric Models: Autoregressive models with exogenous variables (ARMAX, VARX), regime-switching models, and regularized regressions (notably LASSO-based models such as LEAR) have been widely used. These models typically assume a linear dependence of price on lagged prices and exogenous variables such as load and renewable generation, and they explicitly model short-term autocorrelation and periodic effects. A typical ARMAX representation is:

$X_t = \epsilon_t + \sum_{i=1}^p \phi_i X_{t-i} + \sum_{i=1}^q \theta_i \epsilon_{t-i} + \sum_{i=0}^b \eta_i d_t(i)$

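where $X_t$ is the modeled series (here, the price) at time $t$, $\epsilon_t$ is the innovation term, $\phi_i$ and $\theta_i$ are the autoregressive and moving-average coefficients, and $d_t(i)$ denotes exogenous regressors such as load, with coefficients $\eta_i$ (Barta et al., 2015). LASSO-based models (LEAR) perform automatic variable selection, so the fitted model stays parsimonious even when the set of candidate regressors is high-dimensional (Lago et al., 2020, Mascarenhas et al., 2022).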
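As a minimal sketch of the LASSO-regularized autoregressive idea, the following pure-Python example builds a lagged design matrix and fits it by cyclic coordinate descent. The helper names (`build_arx_rows`, `lasso_cd`), the lag set, and the penalty `lam` are illustrative assumptions and not the LEAR implementation of the cited works. The moving-average terms of the equation above are omitted, and no intercept or feature standardization is included.

```python
def build_arx_rows(prices, load, lags=(1, 2)):
    """Return (X, y): each row holds lagged prices plus the current load."""
    start = max(lags)
    X, y = [], []
    for t in range(start, len(prices)):
        X.append([prices[t - lag] for lag in lags] + [load[t]])
        y.append(prices[t])
    return X, y


def soft_threshold(z, lam):
    """Soft-thresholding operator used in the LASSO coordinate update."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_cd(X, y, lam, n_iter=200):
    """Minimise (1/2n)*||y - X b||^2 + lam*||b||_1 (no intercept)
    by cyclic coordinate descent. Hypothetical helper for illustration."""
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    resid = list(y)  # residual y - X b for the current b (b = 0 at start)
    col_sq = [sum(row[j] ** 2 for row in X) / n for j in range(p)]
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Correlation of column j with the partial residual
            rho = sum(X[i][j] * resid[i] for i in range(n)) / n + col_sq[j] * beta[j]
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta
```

Coefficients whose partial correlation with the residual stays below `lam` are set exactly to zero, which is the variable-selection effect described above. In practice `lam` would be chosen by cross-validation or an information criterion, and the regressors would be standardized first.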

Machine Learning Models: Recent years have seen the adoption of deep learning architectures (feed-forward DNNs, GRU/LSTM networks, convolutional networks, and, most recently, Transformers) that jointly predict prices for all hours and capture nonlinearities and complex feature interactions (Jędrzejewski et al., 2022, Marcjasz et al., 2022, Llorente et al., 24 Mar 2024). These models can extract seasonality, autocorrelation, and cross-hour dependencies, especially when combined with expert feature engineering or attention mechanisms. Gradient boosting regression trees have also demonstrated state-of-the-art performance, modeling nonlinear relationships while performing feature selection intrinsically (Barta et al., 2015).

2. Integration of Economic, Physical, and Market Structure Priors

Advanced EPF frameworks increasingly integrate economic and physical market structure:

Supply-Demand (Auction Curve) Approaches: Rather than modeling price directly, fundamental models such as the X-Model forecast the aggregated sale and purchase curves (supply/demand) and derive the price as their intersection point. This enables explicit treatment of bidding behavior and market clearing mechanisms (Ziel et al., 2015). The aggregated supply curve is formulated as:

$\breve{S}_{d,h}(P) = \sum_{p \leq P, p \in \breve{\mathcal{P}}_S} \breve{V}_{S,d,h}(p)$

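where $\breve{V}_{S,d,h}(p)$ is the supply volume bid at price $p$ on the price grid $\breve{\mathcal{P}}_S$ for day $d$ and hour $h$; the market price is identified at the intersection of the reconstructed supply and demand curves.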
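The intersection step can be illustrated with a small sketch. It assumes stepwise curves built from hypothetical `(price, volume)` bids: the supply curve accumulates sell volumes at or below a price, as in the formula above, while the demand curve (a convention assumed here) accumulates buy volumes at or above it. The X-Model itself forecasts these curves, which this example does not attempt.

```python
def supply_at(price, sell_bids):
    """Cumulative volume offered at or below `price` (stepwise supply curve)."""
    return sum(volume for p, volume in sell_bids if p <= price)


def demand_at(price, buy_bids):
    """Cumulative volume requested at or above `price` (stepwise demand curve)."""
    return sum(volume for p, volume in buy_bids if p >= price)


def clearing_price(sell_bids, buy_bids):
    """Lowest bid price at which supply covers demand, or None if none does."""
    grid = sorted({p for p, _ in sell_bids} | {p for p, _ in buy_bids})
    for price in grid:
        if supply_at(price, sell_bids) >= demand_at(price, buy_bids):
            return price
    return None


# Hypothetical bids as (price in EUR/MWh, volume in MWh)
sell_bids = [(10, 100), (20, 100), (40, 100)]
buy_bids = [(50, 120), (30, 60), (15, 80)]
price = clearing_price(sell_bids, buy_bids)  # 20: supply 200 >= demand 180
```

Real auction clearing additionally interpolates between bid steps and handles block orders, which this sketch ignores.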

Hybrid Fundamental-Statistical Models: Hybrid frameworks combine a techno-economic system model (e.g., minimization of system operating cost subject to constraints for generation, storage, and transmission) with stochastic/statistical post-processing (quantile regression, SARMA) to correct for unmodeled speculative behavior or systematic errors (Watermeyer et al., 2023). Coefficient constraints (e.g., merit-order constraints on fuel/CO₂ regressor coefficients) and structured approaches embedding intermediate quantity forecasts (of load, renewables, cross-border flows) improve both physical consistency and robustness to regime shifts (Sgarlato, 2023, Ghelasi et al., 1 Jun 2024, Wang et al., 20 May 2024).
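A heavily simplified sketch of the hybrid idea follows: a merit-order dispatch stands in for the techno-economic model (no storage, transmission, or ramping constraints), and a mean-residual correction stands in for the statistical post-processing. The cited works use quantile regression or SARMA for that second stage; the simpler correction, the function names, and all numbers here are assumptions for illustration.

```python
def marginal_cost(fuel_price, efficiency, emission_factor, co2_price):
    """Short-run marginal cost in EUR/MWh of electricity.

    fuel_price: EUR per MWh of fuel; efficiency: electric output per unit fuel;
    emission_factor: tCO2 per MWh of electricity; co2_price: EUR per tCO2.
    """
    return fuel_price / efficiency + emission_factor * co2_price


def merit_order_price(plants, residual_load):
    """Marginal cost of the last unit needed to cover `residual_load`.

    plants: iterable of (capacity_mw, marginal_cost). Returns None when total
    capacity is insufficient; a non-positive load returns the cheapest cost.
    """
    served = 0.0
    for capacity, cost in sorted(plants, key=lambda plant: plant[1]):
        served += capacity
        if served >= residual_load:
            return cost
    return None


def residual_correction(fundamental, observed):
    """Mean historical gap between observed and fundamental prices."""
    gaps = [o - f for o, f in zip(observed, fundamental)]
    return sum(gaps) / len(gaps)


# Hypothetical fleet: (capacity in MW, marginal cost in EUR/MWh)
plants = [(300, 80.0), (200, 35.0), (400, 10.0)]
fundamental = merit_order_price(plants, residual_load=500)  # 35.0
# Shift the fundamental price by the average past error (3.0 here)
hybrid = fundamental + residual_correction([35.0, 80.0, 35.0], [38.0, 86.0, 35.0])
```

The correction term captures systematic deviations that the dispatch model cannot explain, which is the role the statistical layer plays in the hybrid frameworks described above.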

3. Probabilistic Forecasting and Quantifying Uncertainty

A key trend is a shift from deterministic point forecasts to full probabilistic and path (scenario-based) forecasting:

Quantile Regression and Distributional Outputs: Probabilistic forecasts employ quantile regression averaging (QRA) to generate predictive intervals from multiple point or quantile forecasts. Advanced DNN architectures now use probability layers that output full parametric distributions (e.g., normal, Johnson’s SU), enabling forecasts to capture location, scale, skewness, and kurtosis (Marcjasz et al., 2022).
Simulations and Scenario Generation: Models such as the X-Model (and its long-term extension) use bootstrapped simulations of key random market drivers (physical variables, bid formation) to generate ensembles of price paths. Prediction intervals and probabilities of regulatory or operational interest, such as six consecutive hours of negative prices under the German Renewable Energy Sources Act (EEG), are derived from these ensembles (Ziel et al., 2017).
Ensemble and Generative Models: Implicit generative ensemble post-processing (IGEP) frameworks produce realizations of the joint predictive distribution, reflecting dependencies between hourly prices without assuming parametric marginals or resorting to copula constructions (Janke et al., 2020). Adaptive post-processing using conformal prediction and online aggregation further enhances coverage validity and sharpness, particularly in turbulent markets (Dutot et al., 24 May 2024).
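Two building blocks behind these methods can be sketched briefly: the pinball loss that quantile regression minimizes, and a split conformal interval built from held-out absolute errors. Split conformal prediction is used here as a simple stand-in for the adaptive, online variants in the cited work; the function names and numbers are illustrative assumptions.

```python
import math


def pinball_loss(actual, quantile_forecast, tau):
    """Pinball (quantile) loss for a single observation at level tau in (0, 1)."""
    diff = actual - quantile_forecast
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def conformal_interval(point_forecast, calib_abs_errors, alpha):
    """Split conformal interval with nominal coverage 1 - alpha.

    calib_abs_errors: absolute errors of the point model on a held-out
    calibration set. Returns an infinite interval when the calibration set
    is too small for the requested level.
    """
    n = len(calib_abs_errors)
    k = math.ceil((n + 1) * (1.0 - alpha))  # rank of the conformal quantile
    if k > n:
        return (-math.inf, math.inf)
    width = sorted(calib_abs_errors)[k - 1]
    return (point_forecast - width, point_forecast + width)


# Under-forecasting a high quantile is penalised more than over-forecasting
loss_under = pinball_loss(actual=50.0, quantile_forecast=40.0, tau=0.9)
loss_over = pinball_loss(actual=40.0, quantile_forecast=50.0, tau=0.9)

# Hypothetical calibration errors in EUR/MWh
interval = conformal_interval(100.0, [1, 2, 3, 4, 5, 6, 7, 8, 9], alpha=0.5)
```

The conformal interval is symmetric around the point forecast and relies on exchangeability of the errors, an assumption that price spikes and regime shifts can violate; this is one motivation for the adaptive schemes mentioned above.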

4. Variable Selection, High-Dimensionality, and Model Structure

Model design for EPF is often characterized by:

High-Dimensional Feature Spaces: Forecasting frameworks may employ hundreds of autoregressive, periodic, and exogenous variables, with regularization (LASSO/HQC) ensuring parsimony and mitigating overfitting (Ziel et al., 2018, Lago et al., 2020). Analysis of variable selection reveals the recurring importance of recent lags (lag = 1, lag = 24), cross-hour effects (e.g., the previous day’s last hour on early morning prices), and seasonal dummies.
Univariate vs. Multivariate Designs: Both univariate and multivariate regression frameworks have been extensively compared. Results indicate that the optimal structure is context- and hour-dependent, and that forecast combination (simple averaging of leading univariate and multivariate models) tends to improve overall MAE and robustness (Ziel et al., 2018).
Explicit Incorporation of Exogenous Predictors: The rising share of renewables necessitates the explicit inclusion of accurate wind, solar, and load forecasts as covariates. Recent work demonstrates that significant further accuracy gains accrue from using dense grids of quantile (probabilistic) forecasts of these exogenous drivers, not only their point estimates (Uniejewski et al., 10 Jan 2025). New explanatory features with high predictive power (e.g., nuclear availability in France) contribute further, particularly in regimes with shifting supply composition (Dutot et al., 24 May 2024).
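The forecast-combination point can be illustrated with a toy example. The prices and forecasts below are invented so that the two models err in opposite directions; averaging does not always help, and the gain depends on how correlated the individual errors are.

```python
def mae(actual, forecast):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def average_forecasts(*forecasts):
    """Equal-weight combination of several forecasts, element by element."""
    return [sum(values) / len(values) for values in zip(*forecasts)]


# Hypothetical hourly prices and two competing forecasts (EUR/MWh)
actual = [50.0, 60.0, 55.0, 40.0]
univariate = [54.0, 58.0, 59.0, 37.0]
multivariate = [48.0, 63.0, 53.0, 44.0]
combined = average_forecasts(univariate, multivariate)

mae_uni = mae(actual, univariate)      # 3.25
mae_multi = mae(actual, multivariate)  # 2.75
mae_comb = mae(actual, combined)       # 0.75
```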

5. Evaluation, Benchmarks, and Best Practices

Rigorous assessment protocols are imperative:

Benchmark Datasets and Toolboxes: To address deficiencies in EPF evaluation (short or unique test periods, absence of open data), benchmark datasets spanning multiple years and markets are now available together with open-source toolboxes (e.g., epftoolbox). These support recalibration, automated feature selection, calibration, and statistical testing, enabling reproducible and fair comparisons (Lago et al., 2020).
Accuracy and Economic Measures: Performance is quantified with MAE, rMAE (relative to a naive benchmark), RMSE, SMAPE, and economically meaningful metrics (such as profit/cost impacts and risk indices). Statistical significance is routinely evaluated using Diebold-Mariano and Giacomini-White tests, often on both hourly losses and 24-hour vectorized losses.
Model Robustness and Adaptation: Recent benchmarking indicates that while deep learning and state-of-the-art “foundation” time series models (Chronos-Bolt, Time-MoE, TimesFM, TimeGPT) can achieve high accuracy, robust biseasonal statistical models (such as MSTL) remain strong or dominant, especially in day-ahead auction settings (Sartipi et al., 9 Jun 2025). Further, no single model is uniformly superior; adaptive aggregation and retraining strategies are recommended, especially across periods of dynamic market structure.
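The rMAE metric and a basic Diebold-Mariano statistic can be sketched as follows. The seasonal-naive benchmark (repeating the last `season` observations) and the one-step, absolute-loss form of the test without autocorrelation correction are simplifying assumptions. The function names are hypothetical and do not reproduce the epftoolbox API.

```python
import math


def mae(actual, forecast):
    """Mean absolute error between two equally long sequences."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)


def seasonal_naive(history, horizon, season=168):
    """Repeat the last `season` observations (needs len(history) >= season)."""
    base = len(history) - season
    return [history[base + (i % season)] for i in range(horizon)]


def rmae(actual, forecast, naive):
    """MAE of the model relative to the MAE of the naive benchmark."""
    return mae(actual, forecast) / mae(actual, naive)


def dm_test(errors_a, errors_b):
    """One-sided Diebold-Mariano test on absolute-error loss (horizon 1).

    Returns (statistic, p_value); a small p-value suggests model B is more
    accurate than model A. Raises ZeroDivisionError if every loss
    differential is identical.
    """
    d = [abs(a) - abs(b) for a, b in zip(errors_a, errors_b)]
    n = len(d)
    d_bar = sum(d) / n
    var_d = sum((x - d_bar) ** 2 for x in d) / n
    stat = d_bar / math.sqrt(var_d / n)
    p_value = 0.5 * math.erfc(stat / math.sqrt(2.0))  # 1 - Phi(stat)
    return stat, p_value


# Toy example with a two-step "season" so the arithmetic is easy to follow
naive = seasonal_naive([10.0, 20.0], horizon=2, season=2)  # [10.0, 20.0]
score = rmae([12.0, 26.0], [11.0, 23.0], naive)            # 2.0 / 4.0 = 0.5
```

An rMAE below 1 means the model beats the naive benchmark. For multi-step or vectorized daily losses, the variance estimate in `dm_test` would need to account for autocorrelation in the loss differentials.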

6. Trends, Innovations, and Open Research Questions

Recent and emerging themes include:

Transfer Learning and Cross-Market Generalization: Transfer learning via source market pretraining and fine-tuning improves forecast accuracy in data-limited settings and enables models to generalize across structurally related markets (Gunduz et al., 2020).
Hybrid and Structured Modeling: Embedding domain knowledge, either through hybrid techno-economic/statistical models or explicit structural decomposition (e.g., intermediate forecasts of load/renewables and monotonic supply curve estimation), enables improved interpretability and performance, particularly under data constraints or structural breaks (Sgarlato, 2023, Watermeyer et al., 2023, Ghelasi et al., 1 Jun 2024).
Handling Market Regime Shifts and Extreme Events: Structured and probabilistic models (with conformal calibration or explicit simulation) yield better reliability under rare events, price spikes, and unprecedented regimes. However, challenges remain in fully capturing the causal impact of renewables and exogenous variables during such events (Mascarenhas et al., 2022, Ziel et al., 2017).
Path/Ensemble Forecasts and Decision Support: There is a growing emphasis on scenario-based outputs for risk-sensitive applications such as bidding, scheduling, and grid balancing. The move toward providing full trajectory ensembles and probabilistic paths is motivated by the nonlinearity of operational costs under different supply-demand realizations (Maciejowska et al., 2022, Ziel et al., 2017).

References Table: Representative Model Classes and Methods

Model/Method Class	Core Principle	Notable References
ARMAX/LEAR/LASSO	Linear regression, regularized	(Barta et al., 2015, Lago et al., 2020, Mascarenhas et al., 2022)
Gradient Boosting	Ensemble nonlinear regression	(Barta et al., 2015)
DNN, LSTM, GRU, Transformer	Deep learning, nonlinear, sequence modeling	(Jędrzejewski et al., 2022, Marcjasz et al., 2022, Llorente et al., 24 Mar 2024, Rezaei et al., 2022)
Supply-Demand Curve Models (X-Model)	Auction curve, fundamental modeling	(Ziel et al., 2015, Ziel et al., 2017)
Hybrid Statistical-Fundamental	Economic + stochastic correction	(Watermeyer et al., 2023, Sgarlato, 2023, Ghelasi et al., 1 Jun 2024, Wang et al., 20 May 2024)
Probabilistic & Scenario-based	Quantile/distributional, ensemble/post-processing	(Marcjasz et al., 2022, Janke et al., 2020, Uniejewski et al., 10 Jan 2025, Dutot et al., 24 May 2024)
Pretrained Time Series Foundation Models (TSFM)	Large pre-trained temporal models	(Sartipi et al., 9 Jun 2025)

Summary and Outlook

Electricity price forecasting has rapidly evolved from expert-driven linear techniques to a diverse toolset encompassing structured, probabilistic, and hybrid models, with explicit economic and market mechanics integration. The current frontiers include dense probabilistic input modeling, structural domain-informed regression, foundation models, and robust adaptation to market shocks. Despite the advances, interpretability, adaptation to new regimes, and explicit uncertainty quantification remain central challenges. The field benefits from common benchmarks, advanced evaluation protocols, and open-source tools, ensuring continued methodological innovation and improved operational reliability for real-world electricity markets.