Transformer Multivariate Forecasting: Less is More? (2401.00230v2)
Abstract: In multivariate forecasting, transformer models stand out as powerful tools, displaying exceptional capabilities in handling messy datasets from real-world contexts. However, the inherent complexity of these datasets, characterized by numerous variables and lengthy temporal sequences, poses challenges, including increased noise and extended model runtime. This paper focuses on reducing redundant information to improve forecasting accuracy while optimizing runtime efficiency. We propose a novel transformer forecasting framework enhanced by Principal Component Analysis (PCA) to tackle this challenge. The framework is evaluated with five state-of-the-art (SOTA) models on four diverse real-world datasets. Our experimental results demonstrate the framework's ability to reduce prediction errors across all models and datasets while significantly reducing runtime. From the model perspective, one of the PCA-enhanced models, PCA+Crossformer, reduces mean squared error (MSE) by 33.3% and runtime by 49.2% on average. From the dataset perspective, the framework delivers a 14.3% MSE and 76.6% runtime reduction on the Electricity dataset, and a 4.8% MSE and 86.9% runtime reduction on the Traffic dataset. This study aims to advance various SOTA models and enhance transformer-based time series forecasting for intricate data. Code is available at: https://github.com/jingjing-unilu/PCA_Transformer.
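The pipeline the abstract describes, compressing redundant variables with PCA before feeding a transformer forecaster, can be sketched as below. This is a minimal illustration, not the paper's exact configuration: the synthetic low-rank data, the 95% variance threshold, and the inverse-transform step back to the original variable space are all illustrative assumptions.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for a multivariate series: 1000 time steps of
# 321 correlated variables (sized like the Electricity benchmark).
# The low-rank structure mimics the cross-variable redundancy that
# PCA-based preprocessing targets.
rng = np.random.default_rng(0)
latent = rng.standard_normal((1000, 8))        # 8 underlying factors
mixing = rng.standard_normal((8, 321))         # factor loadings
series = latent @ mixing + 0.1 * rng.standard_normal((1000, 321))

# Step 1: compress the variable dimension with PCA, keeping enough
# components to explain 95% of the variance (threshold is illustrative).
pca = PCA(n_components=0.95)
reduced = pca.fit_transform(series)             # shape (1000, k), k << 321

# Step 2 (not shown): train a transformer forecaster on `reduced`
# instead of `series`, so the model attends over far fewer variables.

# Step 3: map forecasts back to the original 321-variable space.
reconstructed = pca.inverse_transform(reduced)  # shape (1000, 321)

print(series.shape, reduced.shape, reconstructed.shape)
```

Shrinking the variable dimension before training is what drives both reported gains: less cross-variable noise (lower MSE) and smaller inputs (lower runtime).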