SARIMA Forecasting: Techniques & Applications

Updated 26 October 2025

SARIMA-based forecasting is a statistical framework that models both seasonal and nonseasonal dependencies using autoregressive, differencing, and moving average components.
It effectively captures periodic patterns in domains like economics, energy, health, and the environment through systematic parameter selection and residual diagnostics.
Hybrid models integrate SARIMA with nonlinear techniques (e.g., LSTM) to enhance predictive accuracy in scenarios with volatility and structural changes.

A Seasonal Autoregressive Integrated Moving Average (SARIMA) model is a foundational statistical framework for forecasting univariate time series exhibiting both non-seasonal and seasonal patterns. SARIMA extends ARIMA by incorporating explicit seasonal autoregressive, differencing, and moving average components, enabling effective modeling of periodic structures common in domains such as economics, energy, health, and the environment. While the SARIMA structure is highly interpretable, its linearity imposes practical limitations in datasets with nonlinearities, structural breaks, or unmodeled exogenous shocks. The following sections survey theoretical foundations, methodological variants, model selection, practical applications across sectors, benchmark comparisons, and ensemble/hybrid approaches, drawing on results from recent quantitative and methodological advances.

1. SARIMA Model Structure and Theoretical Foundations

SARIMA models are formulated to capture both short-term and seasonal dependencies via their general difference equation:

$\Phi_P(B^s)\,\phi_p(B)\,(1 - B)^d\,(1 - B^s)^D\, y_t = \Theta_Q(B^s)\,\theta_q(B)\, \epsilon_t$

where:

$y_t$ is the time series,
$B$ is the backshift operator ($B y_t = y_{t-1}$),
$p$ and $q$ are the nonseasonal AR and MA orders,
$P$ and $Q$ are the seasonal AR and MA orders,
$d$ and $D$ are the nonseasonal and seasonal differencing orders,
$s$ is the seasonal period (e.g., $s=12$ for monthly data with yearly seasonality),
$\phi_p$ and $\theta_q$ are the nonseasonal AR and MA polynomials, and $\Phi_P$ and $\Theta_Q$ are their seasonal counterparts,
$\epsilon_t$ is white noise.
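
As a worked illustration, consider the SARIMA$(0,1,1)(0,1,1)_{12}$ specification. The general equation does not fix a sign convention for the MA polynomials, so the convention $\theta_1(B) = 1 + \theta_1 B$ and $\Theta_1(B^{12}) = 1 + \Theta_1 B^{12}$ is assumed here. Under that assumption the model reads:

$(1 - B)(1 - B^{12})\, y_t = (1 + \theta_1 B)(1 + \Theta_1 B^{12})\, \epsilon_t$

Since $(1 - B)(1 - B^{12}) = 1 - B - B^{12} + B^{13}$, expanding both sides gives the explicit recursion:

$y_t = y_{t-1} + y_{t-12} - y_{t-13} + \epsilon_t + \theta_1 \epsilon_{t-1} + \Theta_1 \epsilon_{t-12} + \theta_1 \Theta_1 \epsilon_{t-13}$

The cross term $\theta_1 \Theta_1 \epsilon_{t-13}$ shows how the multiplicative structure couples the nonseasonal and seasonal components with only two free MA parameters.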

Identification of appropriate orders $(p, d, q, P, D, Q, s)$ relies on autocorrelation (ACF), partial autocorrelation (PACF) analysis, and information criteria (AIC/AICc, BIC), with stationarity and invertibility established through differencing and root checks (Tewari, 2020, Andrade et al., 10 Mar 2025, Costa et al., 2020). Box–Jenkins methodology provides the canonical workflow for iterative model refinement, residual diagnostics via the Ljung–Box test, and parametric forecasting.
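
The identification step can be made concrete with a minimal, dependency-free sketch of the differencing operators $(1 - B)$ and $(1 - B^s)$, the sample ACF, and the Ljung–Box statistic $Q = n(n+2)\sum_{k=1}^{h} \hat\rho_k^2/(n-k)$. The function names and the toy series (a linear trend plus a period-4 pattern) are illustrative assumptions, not taken from the cited studies; in practice a library implementation would normally be preferred.

```python
def difference(series, lag=1):
    """Apply (1 - B^lag) once: y_t - y_{t-lag}."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]


def sample_acf(series, max_lag):
    """Sample autocorrelations rho_0..rho_max_lag (rho_0 is 1 by construction)."""
    n = len(series)
    mean = sum(series) / n
    centered = [x - mean for x in series]
    c0 = sum(c * c for c in centered)
    if c0 == 0:
        raise ValueError("series is constant; autocorrelation is undefined")
    return [
        sum(centered[t] * centered[t - k] for t in range(k, n)) / c0
        for k in range(max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    acf = sample_acf(residuals, max_lag)
    return n * (n + 2) * sum(acf[k] ** 2 / (n - k) for k in range(1, max_lag + 1))


# Toy series (assumed for illustration): slope-2 trend plus a period-4 pattern.
pattern = [5, -3, 1, -3]
series = [2 * t + pattern[t % 4] for t in range(24)]

seasonal_diff = difference(series, lag=4)    # removes the period-4 pattern
both_diff = difference(seasonal_diff, lag=1)  # removes what is left of the trend

print(seasonal_diff[:5])  # every entry is 8 (slope 2 times period 4)
print(both_diff[:5])      # every entry is 0
print(ljung_box_q([1, -1] * 10, max_lag=1))  # large Q: strong lag-1 autocorrelation
```

When applied to fitted-model residuals, $Q$ is usually compared against a chi-square distribution whose degrees of freedom are $h$ minus the number of estimated ARMA coefficients; a large $Q$ indicates autocorrelation the model has not captured.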

2. Methodological Innovations and Model Variants

Significant methodological advances have extended the SARIMA paradigm:

SARIMA with GARCH and Heavy-Tailed Innovations: For short-term load forecasting under high volatility, SARIMA-GARCH models enable conditional variance modeling, with innovations drawn from distributions such as Student-t or skew-normal to capture fat tails (Chandrarathna et al., 2020). This specification improves forecast accuracy during volatility spikes, as evidenced by an order-of-magnitude reduction in squared/absolute errors relative to normality-based ARIMA-GARCH benchmarks.
SARIMAX (SARIMA with Exogenous Variables): Regression terms integrate exogenous variables (e.g., weather covariates in power demand (Eshragh et al., 2019), or hospitalization metrics in COVID-19 mortality (Toutiaee et al., 2021)). The typical formulation extends the SARIMA equation with a linear regression term:

$\Phi_P(B^s)\phi_p(B)(1 - B)^d(1 - B^s)^D y_t = \Theta_Q(B^s)\theta_q(B)\epsilon_t + \sum_{j=1}^n \beta_j x_t^j$

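Here $x_t^j$ is the $j$-th of $n$ exogenous regressors at time $t$ (the superscript is an index, not a power) and $\beta_j$ is its coefficient. Inclusion of contextual predictors is reported to reduce forecast error by 46.3% to 64.6%, with SARIMAX outperforming both univariate (“crude”) SARIMA and deep recurrent baselines (Eshragh et al., 2019, Toutiaee et al., 2021).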

Rule-Based and Multilevel Seasonal SARMA: Custom SARMA extensions, such as the triple seasonal model for intra-day, intra-week, and intra-year periodicities in French load forecasting, use regime-specific lags and rules to capture special day or holiday effects (Arora et al., 2018). This explicit domain adaptation is shown to halve forecasting errors relative to conventional approaches.
SARIMA in Ensemble/Hybrid Models: SARIMA's capacity for linear-seasonal structure is coupled with nonlinear models (e.g., LSTM, MLP, Transformers) and multiresolution decomposition (VMD, MODWT), where each subcomponent is modeled with a method best adapted to its dynamics (Sun et al., 2020, Suna et al., 2020, Saikia et al., 15 Sep 2025, Sadabad et al., 23 Sep 2025, Boyeena et al., 16 Nov 2024). Performance gains in these settings are substantial, with ensemble RMSE and MAE typically 20–40% lower than single model or two-stage hybrid alternatives.
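
The conditional-variance idea behind the SARIMA-GARCH variant listed above can be sketched with the GARCH(1,1) recursion $\sigma_t^2 = \omega + \alpha\,\epsilon_{t-1}^2 + \beta\,\sigma_{t-1}^2$ applied to the residuals of the mean model. The (1,1) order, the parameter values, and the function name are assumptions chosen for illustration; the cited study's exact specification and its heavy-tailed innovation distribution are not reproduced here.

```python
def garch11_variance(residuals, omega, alpha, beta):
    """Conditional variances sigma_t^2 for each residual under GARCH(1,1).

    variances[t] uses only residuals up to t-1; the recursion is started at
    the unconditional variance omega / (1 - alpha - beta).
    """
    if not (omega > 0 and alpha >= 0 and beta >= 0 and alpha + beta < 1):
        raise ValueError("need omega > 0, alpha, beta >= 0, alpha + beta < 1")
    var = omega / (1 - alpha - beta)
    variances = [var]
    for e in residuals[:-1]:
        var = omega + alpha * e * e + beta * var
        variances.append(var)
    return variances


# Hypothetical residuals from a mean model, with one large shock at index 2.
residuals = [0.1, 0.1, 3.0, 0.1]
variances = garch11_variance(residuals, omega=0.1, alpha=0.2, beta=0.7)
for e, v in zip(residuals, variances):
    print(f"residual={e:5.2f}  conditional variance={v:.4f}")
```

The variance attached to the observation after the shock rises sharply, which is the mechanism that widens prediction intervals during volatility spikes.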

3. Model Selection, Evaluation, and Diagnostics

A rigorous SARIMA-based workflow features:

Stationarity Preprocessing: ADF or KPSS tests identify differencing requirements (Manayaga et al., 2019, Tewari, 2020, Tiwari et al., 2022).
Parameter Grid Search: Extensive parameter sweeps (e.g., evaluating 729 candidate order combinations (Tewari, 2020, Andrade et al., 10 Mar 2025)) support order selection by AIC, MAPE, or RMSE.
Residual Diagnostics: Residuals are examined for autocorrelation (Ljung–Box), normality (Shapiro–Wilk, Kolmogorov–Smirnov), and white-noise behavior to validate model adequacy (Manayaga et al., 2019, Hahn, 2023, Andrade et al., 10 Mar 2025).
Forecast Validation: Holdout or rolling-window testing assesses out-of-sample generalization; one-step and multi-step errors (MAPE, RMSE) quantify predictive accuracy across horizons.
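
The grid-search step in the workflow above can be sketched as follows. One reading of the 729 figure is $3^6$, i.e., three candidate values for each of the six orders $(p, d, q, P, D, Q)$; that reading is an assumption, and the cited studies may have used a different grid. The `score` callable is a hypothetical placeholder for whatever fitting routine returns the chosen criterion.

```python
from itertools import product


def sarima_grid(max_order=2, s=12):
    """All ((p, d, q), (P, D, Q, s)) combinations with each order in 0..max_order."""
    values = range(max_order + 1)
    return [
        ((p, d, q), (P, D, Q, s))
        for p, d, q, P, D, Q in product(values, repeat=6)
    ]


def aic(log_likelihood, n_params):
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def select_best(candidates, score):
    """Return the candidate with the lowest score; `score` is caller-supplied."""
    return min(candidates, key=score)


grid = sarima_grid()
print(len(grid))  # 729, i.e. 3**6
```

In a real workflow `score` would fit the model for each candidate and return its AIC or a holdout error; candidates that fail to converge or violate stationarity or invertibility would be skipped.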

Reported accuracy under such workflows includes <2% MAPE up to 9 steps ahead for GDP (Costa et al., 2020), <1% MAPE for the NIFTY 50 index (Tewari, 2020), and 0.72% MAPE for currency circulation (Andrade et al., 10 Mar 2025).
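
For reference, the error metrics quoted above and a rolling-origin evaluation can be sketched in a few lines. The seasonal-naive forecaster is a stand-in so the example stays self-contained; it is not SARIMA, and the function names and toy series are assumptions for illustration.

```python
import math


def mape(actual, forecast):
    """Mean absolute percentage error, in percent (actual values must be nonzero)."""
    return 100.0 * sum(abs((a - f) / a) for a, f in zip(actual, forecast)) / len(actual)


def rmse(actual, forecast):
    """Root mean squared error, in the units of the series."""
    return math.sqrt(sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def seasonal_naive(history, s, horizon):
    """Stand-in forecaster: repeat the last observed seasonal cycle."""
    return [history[len(history) - s + (h % s)] for h in range(horizon)]


def rolling_origin_mape(series, s, initial, horizon, forecaster=seasonal_naive):
    """MAPE for each step ahead (1..horizon) over expanding-window origins."""
    actual_by_h = [[] for _ in range(horizon)]
    forecast_by_h = [[] for _ in range(horizon)]
    for origin in range(initial, len(series) - horizon + 1):
        forecasts = forecaster(series[:origin], s, horizon)
        for h in range(horizon):
            actual_by_h[h].append(series[origin + h])
            forecast_by_h[h].append(forecasts[h])
    return [mape(a, f) for a, f in zip(actual_by_h, forecast_by_h)]


# Toy series (assumed): period-4 pattern on top of a slow upward trend.
series = [100 + 10 * (t % 4) + t for t in range(40)]
for step, err in enumerate(rolling_origin_mape(series, s=4, initial=12, horizon=6), 1):
    print(f"{step}-step MAPE: {err:.2f}%")
```

Reporting the error separately for each step ahead makes it visible how accuracy degrades with the horizon, which a single pooled figure would hide.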

4. Practical Applications Across Domains

SARIMA-based forecasting demonstrates broad impact in:

Energy and Power Demand: Univariate and hybrid SARIMA approaches deliver high-accuracy peak demand forecasts, especially when enhanced with weather regressors or volatility modeling (Eshragh et al., 2019, Chandrarathna et al., 2020). Periodic and triple-seasonal SARIMA models are also critical in load forecasting under anomalous calendar events (Arora et al., 2018).
Finance and Economics: SARIMA models feature in currency circulation (Andrade et al., 10 Mar 2025), stock index forecasting (Tewari, 2020), remittance prediction (Manayaga et al., 2019), and GDP projections (Costa et al., 2020), with robust handling of high-magnitude and seasonal regimes.
Epidemiology: SARIMA and SARIMAX are prominent in COVID-19 case and death forecasting (Toutiaee et al., 2021, Tiwari et al., 2022), where they are reported to outperform curve-fitting alternatives (e.g., FBProphet) on RMSE, MAE, and policy-relevant accuracy metrics.
Environmental and Climate Analytics: Decomposed SARIMA components are integrated with nonlinear learners for rainfall (Saikia et al., 15 Sep 2025), tourist arrivals (Suna et al., 2020), and water resource variables, targeting both regular periodic modes and nonstationary, transient phenomena.
Infrastructure and Digital Services: SARIMA supports capacity planning for computational resources, with strength in long-horizon (multi-day) predictions where data is stationary and seasonal, though less robust under erratic or on/off streaming workloads (Nashold et al., 2020).

5. Comparative Performance and Hybridization with Machine Learning

While SARIMA delivers competitive performance in stable, seasonal, and linear contexts, limitations arise in the presence of nonlinearities, data-driven regime changes, or latent long-memory dynamics:

Performance Benchmarks: SARIMA often outperforms exponential smoothing (Holt–Winters), SVMs, and crude ARIMA in time series with strong, regular cycles (Costa et al., 2020, Tewari, 2020, Eshragh et al., 2019). However, in complex, nonlinear regimes, modern machine learning (neural networks, LSTM, Transformer) models provide superior accuracy, especially for short-term or highly nonstationary forecasting tasks (Adhikari et al., 2013, Nashold et al., 2020, Karami et al., 13 Oct 2025).
Ensemble Integration: Decomposition-based ensembles assign SARIMA to periodic/linear modes, reserving neural attention or LSTM for residual or nonlinear modes (Sun et al., 2020, Saikia et al., 15 Sep 2025, Boyeena et al., 16 Nov 2024). Reported validations show lower error metrics and a more favorable bias-variance trade-off; the SARIMA blocks contribute interpretability and computational efficiency, while the deep learning layers contribute adaptability.
Feature Engineering and Driver Extraction: In energy price forecasting, SARIMA is used as a pre-processing tool to isolate innovations (shock residuals), enabling more informative regression and combination ensembles (Sadabad et al., 23 Sep 2025). This usage increases driver interpretability and reduces spurious multicollinearity.
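
The additive structure of such hybrids (a seasonal-linear stage plus a second model fitted to its residuals) can be sketched as below. Both stages are deliberately simple stand-ins and are assumptions of this example: a seasonal-mean profile takes the place of SARIMA, and a least-squares AR(1) on the residuals takes the place of the nonlinear learner (e.g., LSTM). Only the way the two forecasts are combined is the point.

```python
def seasonal_means(series, s):
    """Stage 1 stand-in (not SARIMA): mean of the series at each seasonal position."""
    return [sum(series[i::s]) / len(series[i::s]) for i in range(s)]


def fit_ar1(residuals):
    """Stage 2 stand-in (not a neural model): least-squares AR(1) coefficient."""
    num = sum(residuals[t] * residuals[t - 1] for t in range(1, len(residuals)))
    den = sum(r * r for r in residuals[:-1])
    return num / den if den else 0.0


def hybrid_forecast(series, s, steps):
    """Forecast = seasonal-stage forecast + residual-stage forecast."""
    profile = seasonal_means(series, s)
    fitted = [profile[t % s] for t in range(len(series))]
    residuals = [y - f for y, f in zip(series, fitted)]
    phi = fit_ar1(residuals)
    last = residuals[-1]
    forecasts = []
    for h in range(steps):
        t = len(series) + h
        last = phi * last  # iterate the residual model one step forward
        forecasts.append(profile[t % s] + last)
    return forecasts


# Toy series (assumed): a clean period-4 pattern, so stage 2 has nothing to add.
series = [10, 20, 30, 40] * 6
print(hybrid_forecast(series, s=4, steps=4))
```

When the first stage already explains the series, the residual stage contributes nothing and the hybrid reduces to the seasonal forecast; the second stage matters only when structure remains in the residuals.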

6. Extensions, Limitations, and Research Directions

Incorporation of Exogenous Variables: The univariate SARIMA is often outperformed by SARIMAX, especially in systems influenced by weather, policy, or exogenous interventions (Toutiaee et al., 2021, Eshragh et al., 2019). A plausible implication is that integration of auxiliary indicators should be standard when contextual variables are available.
Anomaly Detection: Residual-based detection pipelines rely on accurate SARIMA forecasts; on simple periodic data, SARIMA may provide state-of-the-art anomaly signals at much lower computational cost than deep learning, but yields inferior recall/precision on complex or irregular patterns (Karami et al., 13 Oct 2025).
Handling of Nonlinearities and Structural Regime Shifts: Empirical studies suggest that pure SARIMA approaches underperform when faced with nonlinear growth (e.g., economic shocks, pandemic disruptions), calling for hybrid, ensemble, or regime-switching architectures (Adhikari et al., 2013, Costa et al., 2020, Andrade et al., 10 Mar 2025).
Multivariate and Copula-Integrated Forecasting: Advanced variants employ SARIMA for marginal time series modeling (e.g., mortality trends), using copulas to recover dependence structure across variables, notably in climate-mortality analytics (Barigou et al., 17 Oct 2025). Such designs yield improved joint simulation capability and risk forecasting for actuarial and environmental policy.
Rule-Based and Semantic Integration: In domains with structured calendar or external knowledge (e.g., electric load on special days in France), rule-based SARIMA/SARMA variants that inject expert knowledge achieve state-of-the-art performance (Arora et al., 2018). Similarly, combination of SARIMA forecasts with semantic ontologies enables more nuanced decision-support systems in health surveillance (Tiwari et al., 2022).
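
The residual-based anomaly detection pipeline mentioned in this list can be sketched as a thresholding step on forecast errors. The robust score used here (median and median absolute deviation, scaled by 0.6745, with a cutoff of 3.5) is a common convention adopted as an assumption for this example; it is not stated in the cited work, and the forecasts are hypothetical values standing in for SARIMA output.

```python
from statistics import median


def flag_anomalies(actual, forecast, threshold=3.5):
    """Indices whose forecast residual has a robust z-score above `threshold`."""
    residuals = [a - f for a, f in zip(actual, forecast)]
    center = median(residuals)
    mad = median(abs(r - center) for r in residuals)
    if mad == 0:
        # Degenerate case: more than half the residuals are identical,
        # so any residual that differs from the median is flagged.
        return [i for i, r in enumerate(residuals) if r != center]
    return [
        i for i, r in enumerate(residuals)
        if 0.6745 * abs(r - center) / mad > threshold
    ]


# Hypothetical forecasts and observations; index 5 contains an injected spike.
forecast = [10.0] * 8
actual = [10.1, 9.8, 10.0, 10.2, 9.9, 15.0, 10.1, 9.9]
print(flag_anomalies(actual, forecast))
```

Because the score depends entirely on the residuals, detection quality is bounded by forecast quality, which is consistent with the weaker recall and precision reported on irregular patterns.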

7. Summary Table: SARIMA-based Forecasting in Practice

Application Domain	SARIMA Variant	Notable Result/Metric
Power Demand	SARIMA, SARIMAX	46% reduction in MAPE with exogenous variables (Eshragh et al., 2019)
Epidemiology	SARIMAX	Up to 64.58% improvement in sMAPE vs. GCN-LSTM (Toutiaee et al., 2021)
Economic Indicators	SARIMA	<2% MAPE for 9-step GDP forecasts (Costa et al., 2020)
Financial Markets	SARIMA	0.9% MAPE and RMSE = 139.67 on NIFTY 50 (Tewari, 2020)
Environmental Hybrid	Wavelet-SARIMA-Transformer	Lowest RMSE/MAE, best agreement in rainfall (Saikia et al., 15 Sep 2025)
Energy Market Pricing	SARIMA-preprocess/VARX	Ensemble RMSE down to 12.28 with PCA-SS (Sadabad et al., 23 Sep 2025)

These results collectively underscore SARIMA's continued utility as an analytically transparent, computationally tractable core within contemporary forecasting pipelines, while highlighting the necessity of hybridization, domain-aware adaptations, and integration of auxiliary information to address the evolving complexity of real-world time series.