DynST: Dynamic Sparse Training for Resource-Constrained Spatio-Temporal Forecasting (2403.02914v2)
Abstract: The ever-growing deployment of sensor services, while opening a valuable path and providing a deluge of earth system data for deep-learning-oriented earth science, introduces a daunting obstacle to industrial-level deployment. Concretely, earth science systems rely heavily on the extensive deployment of sensors; however, data collection from sensors is constrained by complex geographical and social factors, making it challenging to achieve comprehensive coverage and uniform deployment. To alleviate this obstacle, traditional approaches to sensor deployment use specific algorithms to design and deploy sensors. These methods \textit{dynamically adjust the activation times of sensors to optimize the detection process across each sub-region}. Regrettably, formulating an activation strategy generally relies on historical observations and geographic characteristics, which makes the methods and the resulting models neither simple nor practical. Worse still, the complex technical design may ultimately lead to a model with weak generalizability. In this paper, we introduce for the first time the concept of dynamic sparse training for spatio-temporal data, with the aim of adaptively and dynamically filtering important sensor distributions. To our knowledge, this is the \textbf{first} proposal (\textit{termed DynST}) of an \textbf{industry-level} deployment optimization concept at the data level. However, because of the temporal dimension, pruning spatio-temporal data may lead to conflicts across different timestamps. To address this, we employ dynamic merge technology together with a dimensional mapping to mitigate the potential impact of the temporal aspect. During training, DynST uses iterative pruning and sparse training, repeatedly identifying and dynamically removing the sensor perception areas that contribute the least to future predictions.
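The iterative pruning loop described at the end of the abstract can be illustrated with a minimal sketch. This is not the DynST implementation: the function names `prune_step` and `iterative_prune`, the pruning `fraction`, and the importance score (mean absolute value over time) are all assumptions made for illustration. The abstract does not specify how importance is computed or how the dynamic merge works. The sketch keeps a single spatial mask shared by every timestamp, which is one simple way to avoid per-timestamp conflicts.

```python
import numpy as np


def prune_step(scores, mask, fraction):
    """Deactivate the lowest-scoring `fraction` of the currently active regions.

    scores, mask: arrays of the same spatial shape (H, W); mask is boolean.
    Returns a new mask; the input mask is left unchanged.
    """
    active = np.flatnonzero(mask)              # flat indices of active regions
    n_drop = int(len(active) * fraction)       # how many regions to remove
    new_mask = mask.copy().ravel()
    if n_drop > 0:
        # Sort active regions by score, ascending, and drop the first n_drop.
        order = active[np.argsort(scores.ravel()[active], kind="stable")]
        new_mask[order[:n_drop]] = False
    return new_mask.reshape(mask.shape)


def iterative_prune(data, rounds, fraction):
    """Repeatedly prune the least important regions of a (T, H, W) array.

    Assumption: importance is the mean absolute value over time. A real
    system would derive it from the forecasting model instead.
    """
    mask = np.ones(data.shape[1:], dtype=bool)  # one mask for all timestamps
    for _ in range(rounds):
        scores = np.abs(data).mean(axis=0)      # hypothetical importance score
        mask = prune_step(scores, mask, fraction)
        # A sparse training step on data * mask would go here.
    return mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = rng.normal(size=(8, 4, 5))         # 8 timestamps, 4x5 sensor grid
    kept = iterative_prune(frames, rounds=2, fraction=0.25)
    print(int(kept.sum()), "of", kept.size, "regions kept")
```

Because the mask is shared across the time axis, a region is either observed at every timestamp or at none, so the forecasting model always sees a consistent spatial layout.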