Structural Knowledge Informed Continual Multivariate Time Series Forecasting (2402.12722v1)
Abstract: Recent studies in multivariate time series (MTS) forecasting reveal that explicitly modeling the hidden dependencies among different time series can yield promising forecasting performance and reliable explanations. However, modeling variable dependencies remains underexplored when MTS is continuously accumulated under different regimes (stages). Due to the potential distribution and dependency disparities, the underlying model may encounter the catastrophic forgetting problem, i.e., it is challenging to memorize and infer different types of variable dependencies across different regimes while maintaining forecasting performance. To address this issue, we propose a novel Structural Knowledge Informed Continual Learning (SKI-CL) framework to perform MTS forecasting within a continual learning paradigm, which leverages structural knowledge to steer the forecasting model toward identifying and adapting to different regimes, and selects representative MTS samples from each regime for memory replay. Specifically, we develop a forecasting model based on graph structure learning, where a consistency regularization scheme is imposed between the learned variable dependencies and the structural knowledge while optimizing the forecasting objective over the MTS data. As such, MTS representations learned in each regime are associated with distinct structural knowledge, which helps the model memorize a variety of conceivable scenarios and results in accurate forecasts in the continual learning context. Meanwhile, we develop a representation-matching memory replay scheme that maximizes the temporal coverage of MTS data to efficiently preserve the underlying temporal dynamics and dependency structures of each regime. Thorough empirical studies on synthetic and real-world benchmarks validate SKI-CL's efficacy and advantages over the state-of-the-art for continual MTS forecasting tasks.
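As a rough illustration of the consistency regularization idea described in the abstract, the sketch below combines a forecasting error with a penalty that pulls a learned dependency matrix toward a structural-knowledge matrix. The abstract does not specify the form of either term, so the mean-squared penalties, the function name `structure_regularized_loss`, and the weight `lam` are all assumptions made for illustration and are not the authors' formulation.

```python
import numpy as np

def structure_regularized_loss(y_pred, y_true, learned_adj, prior_adj, lam=0.1):
    """Illustrative objective: forecasting error plus a structure-consistency penalty.

    All names and the squared-error form are hypothetical; the source abstract
    only states that a consistency regularization is imposed between learned
    variable dependencies and structural knowledge.
    """
    # Forecasting term: mean squared error between predictions and targets.
    forecast_loss = np.mean((y_pred - y_true) ** 2)
    # Consistency term: mean squared gap between the learned dependency
    # matrix and the structural-knowledge (prior) matrix.
    consistency_loss = np.mean((learned_adj - prior_adj) ** 2)
    total = forecast_loss + lam * consistency_loss
    return total, forecast_loss, consistency_loss

# Toy usage: two variables, perfect forecasts, but a learned graph that is
# denser than the prior (identity) structure.
y_true = np.zeros((4, 2))
y_pred = np.zeros((4, 2))
prior_adj = np.eye(2)
learned_adj = np.ones((2, 2))
total, f_loss, c_loss = structure_regularized_loss(y_pred, y_true, learned_adj, prior_adj)
print(total, f_loss, c_loss)
```

In a continual setting, one would presumably swap in a different `prior_adj` for each regime, so that the representations learned in each regime are tied to that regime's structural knowledge; how the actual framework does this is not detailed in the abstract.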