Probabilistic Forecasting with Temporal Convolutional Neural Network (1906.04397v3)

Published 11 Jun 2019 in stat.ML and cs.LG

Abstract: We present a probabilistic forecasting framework based on convolutional neural network for multiple related time series forecasting. The framework can be applied to estimate probability density under both parametric and non-parametric settings. More specifically, stacked residual blocks based on dilated causal convolutional nets are constructed to capture the temporal dependencies of the series. Combined with representation learning, our approach is able to learn complex patterns such as seasonality, holiday effects within and across series, and to leverage those patterns for more accurate forecasts, especially when historical data is sparse or unavailable. Extensive empirical studies are performed on several real-world datasets, including datasets from JD.com, China's largest online retailer. The results show that our framework outperforms other state-of-the-art methods in both accuracy and efficiency.

Authors (4)

Yitian Chen (20 papers)
Yanfei Kang (26 papers)
Yixiong Chen (22 papers)
Zizhuo Wang (24 papers)

Citations (284)

View on Semantic Scholar

Summary

The paper presents a novel TCN-based framework that integrates dilated causal convolutions and residual networks to capture long-term temporal dependencies.
The method models both Gaussian and quantile-based distributions, outperforming benchmarks like SARIMA and DeepAR-t across various datasets.
The framework incorporates exogenous variables effectively, enabling scalable industrial applications and inspiring future research in diverse domains.

Overview of Probabilistic Forecasting with Temporal Convolutional Neural Network

The paper "Probabilistic Forecasting with Temporal Convolutional Neural Network" presents a convolutional neural network (CNN) framework for probabilistic forecasting, specifically designed for multiple related time series. This framework stands out for its robustness in handling challenges typical in real-world forecasting scenarios, such as data sparsity and cold-start problems, while maintaining high flexibility and scalability.

Methodological Details

The proposed framework integrates dilated causal convolutional networks and residual neural networks to capture temporal dependencies in time series data effectively. The framework models both parametric and non-parametric probabilistic distributions, accommodating dynamic regression models similar to ARIMAX.

Key Components

Encoder Module: The encoder utilizes dilated causal convolutions structured in residual blocks, which enhances the network's ability to capture long-term dependencies. Each residual block applies batch normalization and rectified linear unit (ReLU) activations that improve training efficiency and prediction accuracy.
Decoder Module: Combining features from historical observations with those from exogenous variables through a variant of the residual network (resnet-v) enables the model to seamlessly apply non-linear transformations, resulting in better adaptation to complex patterns and dynamic changes in the time series.
Probabilistic Framework: The framework supports both parametric and non-parametric forecasting approaches. A Gaussian distribution serves as the basis for a parametric approach via maximum likelihood estimation, while quantile regression drives the non-parametric version, allowing for distribution-free robustness in estimating diverse probability densities.

Performance

The empirical evaluation encompasses multiple datasets, including those from industrial contexts such as demand and shipment forecasting for JD.com, and specifically designed public datasets like electricity and traffic data. DeepTCN was shown to consistently outperform established benchmarks such as SARIMA and lightGBM, as well as state-of-the-art deep learning approaches like DeepAR- $t$ and SQF-RNN, across various accuracy metrics for both point and probabilistic forecasting.

Implications and Future Directions

The DeepTCN framework's demonstrated superiority and adaptability suggest its viability for large-scale industrial applications where high-dimensional time series data is prevalent. The model's ability to incorporate exogenous variables presents a significant advantage, particularly in scenarios where external factors heavily influence future behavior, such as prices or promotional events in retail forecasting.

Future research could explore the application of DeepTCN in other domains such as healthcare, finance, and environmental science, each presenting unique challenges in data dependency and external influences. Additionally, advancements could focus on extending the current structure to integrate other probabilistic models, potentially improving forecast understanding under diverse data conditions.

In summary, the paper contributes a comprehensive CNN-based probabilistic forecasting framework adept at learning from complex, multi-series datasets, significantly advancing the predictive capabilities in scenarios with varying data density and external influence considerations.

PDF Markdown