Message Propagation Through Time: An Algorithm for Sequence Dependency Retention in Time Series Modeling (2309.16882v1)

Published 28 Sep 2023 in cs.LG

Abstract: Time series modeling, a crucial area in science, often encounters challenges when Machine Learning (ML) models such as Recurrent Neural Networks (RNNs) are trained with the conventional mini-batch strategy, which assumes independent and identically distributed (IID) samples and initializes RNNs with zero hidden states. The IID assumption ignores temporal dependencies among samples, resulting in poor performance. This paper proposes the Message Propagation Through Time (MPTT) algorithm, which incorporates long temporal dependencies while preserving faster training times than stateful solutions. MPTT utilizes two memory modules to asynchronously manage initial hidden states for RNNs, fostering seamless information exchange between samples and allowing diverse mini-batches across epochs. MPTT further implements three policies that filter out outdated information and preserve essential information in the hidden states, generating informative initial hidden states for RNNs and facilitating robust training. Experimental results demonstrate that MPTT outperforms seven strategies on four climate datasets with varying levels of temporal dependencies.
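
The abstract describes the mechanism but includes no code. The following is a minimal PyTorch sketch of the central idea, initializing each mini-batch from stored per-sequence hidden states instead of zeros, written under explicit assumptions: the abstract specifies two asynchronous memory modules and three filtering policies, whereas this sketch collapses them into a single hypothetical `MessageMemory` class with a simple exponential-decay blend. All names and hyperparameters are illustrative, not the authors' implementation.

```python
# Illustrative sketch of the MPTT idea from the abstract: initialize each RNN
# mini-batch from stored per-sequence hidden states rather than zeros.
# Class names, the single memory module, and the decay-based blending (a crude
# stand-in for the paper's three policies) are all assumptions.
import torch


class MessageMemory:
    """Keyed store of candidate initial hidden states, one per sequence id."""

    def __init__(self, hidden_size, decay=0.9):
        self.states = {}                # sequence id -> detached hidden state
        self.hidden_size = hidden_size
        self.decay = decay              # stand-in for MPTT's filtering policies

    def read(self, seq_ids, device):
        # Unseen sequences fall back to the conventional zero initialization.
        rows = [self.states.get(i, torch.zeros(self.hidden_size)) for i in seq_ids]
        return torch.stack(rows).unsqueeze(0).to(device)  # (1, batch, hidden)

    def write(self, seq_ids, final_states):
        # Detach so gradients never flow across mini-batch boundaries, then
        # blend new and stored states so stale information decays away.
        for i, h in zip(seq_ids, final_states.detach().cpu()):
            old = self.states.get(i, torch.zeros(self.hidden_size))
            self.states[i] = self.decay * h + (1.0 - self.decay) * old


def train_epoch(rnn, head, loader, memory, optimizer, loss_fn, device):
    """One epoch; `loader` yields (list of int sequence ids, inputs, targets)."""
    for seq_ids, x, y in loader:               # x: (batch, time, features)
        h0 = memory.read(seq_ids, device)      # informative initial states
        out, hT = rnn(x.to(device), h0)        # e.g. nn.GRU(..., batch_first=True)
        loss = loss_fn(head(out), y.to(device))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
        memory.write(seq_ids, hT.squeeze(0))   # propagate "messages" onward
```

In this sketch, stored states are detached before writing, so gradients never cross mini-batch boundaries and only the forward initialization carries long-range context. That separation is what permits shuffled, diverse mini-batches while still propagating temporal information, consistent with the trade-off between stateful training and IID mini-batching that the abstract describes.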
