- The paper introduces a novel ARIMA-ANN hybrid model integrated with EMD that reduces forecasting errors by 23% to 89% across varied datasets.
- The methodology flexibly decomposes each time series into linear and nonlinear components, applying ARIMA to the stationary linear component and an ANN to the nonlinear patterns.
- Empirical tests on datasets such as Wolf’s sunspot data, currency exchange rates, and electricity market prices demonstrate the model’s robustness and enhanced accuracy.
Improving Forecasting Accuracy of Time Series Data Using a New ARIMA-ANN Hybrid Method and Empirical Mode Decomposition
This paper presents a novel approach to time series forecasting by introducing a hybrid method that leverages the strengths of both linear and nonlinear modeling techniques. Specifically, the authors propose a combination of Autoregressive Integrated Moving Average (ARIMA) and Artificial Neural Networks (ANNs), enhanced by Empirical Mode Decomposition (EMD), to improve forecasting accuracy for varied datasets.
Overview of the Approach
The core contribution of this research is the development of an ARIMA-ANN hybrid model that overcomes several limitations of traditional methods. Traditional hybrid models often make stringent assumptions regarding the decomposition and combination of linear and nonlinear components in time series data, which can limit their applicability across different domains. The proposed method addresses these challenges by:
- Flexible Decomposition: The method begins by decomposing the time series with a moving average (MA) filter into linear and nonlinear components, rather than assuming that the ARIMA output is itself the linear component.
- Adaptive Hybridization: After decomposition, ARIMA is applied to the stationary linear component, while an ANN captures the nonlinear patterns and also combines the component forecasts into the final output. This flexible hybridization does not assume that the components are additive, which allows for more complex interactions between them.
- Enhanced Stationarity with EMD: The method further integrates EMD to split the time series into more stationary subseries, called intrinsic mode functions (IMFs), which are modeled individually to capitalize on their increased regularity and reduced volatility. This additional decomposition step substantially improves forecasting accuracy, particularly for volatile and non-stationary series.
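The moving-average decomposition in the first step can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `ma_decompose`, the centered window, the edge padding, and the window length of 7 are all assumptions made for illustration, and the summary does not say how the paper chooses the filter length. The sketch covers only the MA split; the ARIMA, ANN, and EMD stages are left out.

```python
import numpy as np

def ma_decompose(y, window):
    """Split a series into a moving-average (smooth) part and the remainder.

    Hypothetical helper for illustration; the centered window and the
    edge padding are assumptions, not details taken from the paper.
    """
    y = np.asarray(y, dtype=float)
    if window < 1 or window > y.size:
        raise ValueError("window must be between 1 and len(y)")
    kernel = np.ones(window) / window
    # Pad with edge values so the centered average has the same length as y.
    pad_left = window // 2
    pad_right = window - 1 - pad_left
    padded = np.pad(y, (pad_left, pad_right), mode="edge")
    smooth = np.convolve(padded, kernel, mode="valid")
    remainder = y - smooth
    return smooth, remainder

# Synthetic example: trend + cycle + noise (made-up data, not from the paper).
rng = np.random.default_rng(0)
t = np.arange(200)
series = 0.05 * t + np.sin(t / 5.0) + rng.normal(scale=0.3, size=t.size)
smooth, remainder = ma_decompose(series, window=7)
```

In a pipeline of the kind described above, the smooth part would go to ARIMA and the remainder to the ANN. By construction the two parts sum back to the original series, so no information is lost at this stage.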
Experimental Validation and Results
The authors validate their approach on four datasets: the well-known Wolf’s sunspot data, the Canadian lynx data, the British pound/US dollar exchange rate data, and Turkey's intraday electricity market prices. Together these datasets pose a diverse set of challenges, from trends and seasonality to volatility, that test the robustness of the proposed model.
The empirical results show that:
- The novel hybrid method significantly outperforms individual ARIMA and ANN models, as well as existing hybrids by Zhang, Khashei-Bijari, and Babu-Reddy, across all datasets tested.
- The integration of EMD further enhances accuracy, with percentage improvements in Mean Absolute Error (MAE), Mean Squared Error (MSE), and Mean Absolute Scaled Error (MASE) across datasets ranging from 23% to 89%.
- The improvements are particularly pronounced in highly volatile datasets, showcasing the proposed method’s ability to handle complex, non-linear patterns effectively.
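For reference, the three error measures and the percentage improvement can be computed as in the sketch below. The definitions are the standard ones; scaling MASE by the in-sample error of a one-step naive forecast is the usual non-seasonal convention and is assumed here, since the summary does not state which variant the paper uses. The function names are illustrative.

```python
import numpy as np

def mae(actual, forecast):
    """Mean Absolute Error."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean(np.abs(a - f)))

def mse(actual, forecast):
    """Mean Squared Error."""
    a, f = np.asarray(actual, float), np.asarray(forecast, float)
    return float(np.mean((a - f) ** 2))

def mase(actual, forecast, train):
    """MAE scaled by the in-sample MAE of a one-step naive forecast.

    Assumes the non-seasonal variant: the naive forecast for each
    training point is simply the previous training value.
    """
    scale = float(np.mean(np.abs(np.diff(np.asarray(train, float)))))
    return mae(actual, forecast) / scale

def pct_improvement(baseline_error, new_error):
    """Relative error reduction of a new model over a baseline, in percent."""
    return 100.0 * (baseline_error - new_error) / baseline_error
```

Under this definition, an improvement of 89% means the new model's error is 11% of the baseline's, and an improvement of 23% means it is 77% of the baseline's.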
Implications and Future Directions
The research demonstrates the efficacy of combining linear and nonlinear methods in a flexible hybrid architecture, contributing to the body of knowledge on time series forecasting. The approach’s adaptability to diverse data characteristics makes it a valuable tool for a wide range of practical applications, from financial markets to energy forecasting.
The use of EMD highlights the potential for multiscale decomposition techniques to enhance model performance by increasing data regularity. Future research could explore integrating other decomposition techniques with hybrid models or extending this methodology to multi-step forecasting scenarios. Additionally, investigating the method's performance over larger and more heterogeneous datasets could further establish its robustness and practicality.
Overall, the work shows that pairing a flexible ARIMA-ANN hybrid with EMD can substantially reduce forecasting error on the series tested, which in turn supports better decision-making in fields such as finance and energy.