Short-term Load Forecasting with Deep Residual Networks (1805.11956v1)

Published 30 May 2018 in stat.ML, cs.LG, and stat.AP

Abstract: We present in this paper a model for forecasting short-term power loads based on deep residual networks. The proposed model is able to integrate domain knowledge and researchers' understanding of the task by virtue of different neural network building blocks. Specifically, a modified deep residual network is formulated to improve the forecast results. Further, a two-stage ensemble strategy is used to enhance the generalization capability of the proposed model. We also apply the proposed model to probabilistic load forecasting using Monte Carlo dropout. Three public datasets are used to prove the effectiveness of the proposed model. Multiple test cases and comparison with existing models show that the proposed model is able to provide accurate load forecasting results and has high generalization capability.


Summary

The paper introduces a deep residual network ensemble that significantly improves short-term load forecasting accuracy.
It combines domain knowledge with a modified ResNet architecture and ensemble strategies to overcome gradient vanishing and capture uncertainty.
Experimental results on public datasets show lower MAPE compared to traditional methods like linear regression and support vector regression.


The paper presents an approach to short-term load forecasting (STLF) that uses deep residual networks (ResNet) to predict electric load over short time horizons. The authors propose a model architecture that combines domain knowledge with deep learning building blocks to improve both prediction accuracy and generalization capability. The work is motivated by fluctuating power demand and the added complexity of integrating renewable energy sources, both of which make accurate forecasting important for managing modern power systems.

Model Architecture

The proposed model is built around a modified deep residual network, chosen to capture the non-linearities present in load data. Residual networks mitigate difficulties common in training deep models, such as vanishing gradients, by adding shortcut connections that let gradients flow directly past intermediate layers. The architecture is complemented by an ensemble strategy that combines individual models taken from different training phases and trained with different parameters. According to the authors, this ensemble improves the generalization and stability of the predictions.
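The shortcut connection described above can be illustrated with a minimal sketch. This is not the paper's modified architecture: the layer sizes, the ReLU activation, and the helper names `dense`, `relu`, and `residual_block` are assumptions chosen only to show how a block computes its input plus a learned correction.

```python
def dense(x, weights, bias):
    """Affine layer; `weights` holds one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def relu(values):
    """Element-wise ReLU (an assumed activation, used here for simplicity)."""
    return [max(0.0, v) for v in values]


def residual_block(x, w1, b1, w2, b2):
    """Return x + F(x), where F is two dense layers with a ReLU in between.

    The shortcut adds the input unchanged, so the block only has to learn
    a correction to the identity mapping.
    """
    hidden = relu(dense(x, w1, b1))
    correction = dense(hidden, w2, b2)
    return [xi + ci for xi, ci in zip(x, correction)]


x = [0.5, -1.0, 2.0]
zero_w = [[0.0] * 3 for _ in range(3)]
zero_b = [0.0] * 3
eye = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

# With F(x) = 0 the block is exactly the identity mapping.
identity_out = residual_block(x, zero_w, zero_b, zero_w, zero_b)

# With identity weights, F(x) = relu(x), so the output is x + relu(x).
shifted_out = residual_block(x, eye, zero_b, eye, zero_b)
```

Because the input passes through the addition untouched, the gradient of the output with respect to `x` always contains an identity term, which is the usual explanation for why stacking such blocks eases the training of deep networks.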

Ensemble Strategy and Probabilistic Forecasting

A notable element of this work is the two-stage ensemble approach, which aggregates models captured at distinct training epochs so that the combined forecast is less sensitive to any single model. In addition, the authors apply Monte Carlo dropout to extend the model to probabilistic forecasting, so that it provides an estimate of predictive uncertainty alongside each point prediction. This probabilistic output is useful for risk management in power system operations, where decisions depend on how much a forecast can be trusted.
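Both ideas in this section can be sketched in a few lines. The code below is an illustration under stated assumptions, not the authors' implementation: `toy_forecast` is a hypothetical one-layer stand-in for the network, the dropout rate and sample count are arbitrary, and the ensemble is shown as a plain average of forecasts from several saved models.

```python
import random
import statistics


def ensemble_average(predictions):
    """Average point forecasts from several models, one list per model."""
    return [statistics.fmean(step) for step in zip(*predictions)]


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`,
    and rescale the survivors so the expected value is unchanged."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


def toy_forecast(features, weights, rate, rng):
    """Hypothetical one-layer model with dropout left on at prediction time."""
    dropped = dropout(features, rate, rng)
    return sum(w * f for w, f in zip(weights, dropped))


def mc_dropout_predict(features, weights, rate=0.1, n_samples=200, seed=0):
    """Run many stochastic forward passes; return their mean and std."""
    rng = random.Random(seed)
    samples = [toy_forecast(features, weights, rate, rng)
               for _ in range(n_samples)]
    return statistics.fmean(samples), statistics.stdev(samples)


# Forecasts for two time steps from three saved models (made-up numbers).
predictions_by_snapshot = [[100.0, 120.0], [104.0, 118.0], [102.0, 122.0]]
combined = ensemble_average(predictions_by_snapshot)

# Spread across stochastic passes serves as a rough uncertainty estimate.
mean_load, load_std = mc_dropout_predict([1.0, 2.0], [1.0, 1.0], rate=0.5)
```

The standard deviation across passes reflects only the variability induced by dropout; a complete probabilistic forecast would typically also account for noise in the data itself.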

Evaluation and Results

Experiments on three public datasets compare the model against existing methods, including linear regression and machine learning approaches such as support vector regression (SVR) and extreme learning machines (ELM). The results indicate that the proposed model consistently achieves a lower mean absolute percentage error (MAPE) than these baselines and remains robust to variation in its inputs, such as fluctuations in temperature.
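For reference, MAPE is the average of the absolute forecast errors expressed as a percentage of the actual load. A minimal sketch follows; the function name `mape` and the sample values are illustrative and are not taken from the paper's experiments.

```python
def mape(actual, forecast):
    """Mean absolute percentage error, in percent.

    Assumes both sequences have the same length and that no actual
    value is zero, which is reasonable for system-level load data.
    """
    if len(actual) != len(forecast):
        raise ValueError("actual and forecast must have the same length")
    if not actual:
        raise ValueError("at least one observation is required")
    total = sum(abs((a - f) / a) for a, f in zip(actual, forecast))
    return 100.0 * total / len(actual)


# Two hourly loads, each forecast with a 10% error in opposite directions.
example_error = mape([100.0, 200.0], [110.0, 180.0])
```

Because each error is divided by the actual value, MAPE is scale-free and allows comparison across datasets with different load levels, but it becomes unstable when actual values are close to zero.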

Implications and Future Directions

The implications of this work are twofold. Practically, the STLF model could be integrated into energy providers' systems to improve load management, reduce costs, and support the transition towards smart grids. Theoretically, the paper opens the way for further work on deeper non-parametric models that go beyond the limitations of traditional deterministic forecasting approaches.

Future work could incorporate additional neural network components, such as convolutional or recurrent layers, which may better capture the temporal and sequential dependencies in load data. Extending these methods to real-time, adaptive forecasting could further align short-term predictions with changing environmental and market conditions, supporting more reliable and resilient energy systems.
