Causality-Aware Spatiotemporal Graph Neural Networks for Spatiotemporal Time Series Imputation (2403.11960v4)
Abstract: Spatiotemporal time series are typically collected by monitoring sensors placed at different locations, and they often contain missing values caused by failures such as mechanical damage and Internet outages. Imputing the missing values is crucial for analyzing time series. When recovering a specific data point, most existing methods use all the information relevant to that point, regardless of whether it stands in a cause-and-effect relationship with the point. Data collection inevitably introduces unknown confounders, e.g., background noise in the time series and non-causal shortcut edges in the constructed sensor network. These confounders can open backdoor paths and establish non-causal correlations between the input and the output, and over-exploiting such correlations can cause overfitting. In this paper, we first revisit spatiotemporal time series imputation from a causal perspective and show how to block the confounders via the frontdoor adjustment. Based on the results of the frontdoor adjustment, we introduce a novel Causality-Aware Spatiotemporal Graph Neural Network (Casper), which contains a novel Prompt Based Decoder (PBD) and a Spatiotemporal Causal Attention (SCA). PBD reduces the impact of confounders, and SCA discovers the sparse causal relationships among embeddings. Theoretical analysis reveals that SCA discovers causal relationships based on the values of gradients. We evaluate Casper on three real-world datasets, and the experimental results show that Casper outperforms the baselines and effectively discovers causal relationships.
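The frontdoor adjustment mentioned in the abstract can be illustrated with a small worked example. The sketch below is not Casper and is not taken from the paper; it only checks the textbook frontdoor formula, P(Y | do(X=x)) = sum_m P(m | x) sum_x' P(Y | m, x') P(x'), on a hypothetical binary model in which an unobserved confounder U affects both the input X and the output Y, while X affects Y only through a mediator M. All variable names and probabilities are assumptions chosen for illustration.

```python
import itertools

# Hypothetical binary model (an assumption for illustration only):
#   U -> X, U -> Y  (U is an unobserved confounder)
#   X -> M -> Y     (X affects Y only through the mediator M)
P_U = {0: 0.5, 1: 0.5}                    # P(U=u)
P_X1_GIVEN_U = {0: 0.2, 1: 0.8}           # P(X=1 | U=u)
P_M1_GIVEN_X = {0: 0.1, 1: 0.9}           # P(M=1 | X=x)
P_Y1_GIVEN_MU = {(0, 0): 0.1, (0, 1): 0.5,
                 (1, 0): 0.4, (1, 1): 0.9}  # P(Y=1 | M=m, U=u)


def bern(p_one, value):
    """Probability of a binary value given P(value = 1)."""
    return p_one if value == 1 else 1.0 - p_one


def joint(u, x, m, y):
    """Full joint P(u, x, m, y) implied by the model above."""
    return (P_U[u]
            * bern(P_X1_GIVEN_U[u], x)
            * bern(P_M1_GIVEN_X[x], m)
            * bern(P_Y1_GIVEN_MU[(m, u)], y))


def observed(x=None, m=None, y=None):
    """Observational probability of the given values, with U summed out."""
    total = 0.0
    for u, xv, mv, yv in itertools.product((0, 1), repeat=4):
        if x is not None and xv != x:
            continue
        if m is not None and mv != m:
            continue
        if y is not None and yv != y:
            continue
        total += joint(u, xv, mv, yv)
    return total


def truth(x):
    """Ground-truth P(Y=1 | do(X=x)), computed with access to U."""
    return sum(P_U[u] * bern(P_M1_GIVEN_X[x], m) * P_Y1_GIVEN_MU[(m, u)]
               for u in (0, 1) for m in (0, 1))


def naive(x):
    """Confounded estimate P(Y=1 | X=x), read directly off observed data."""
    return observed(x=x, y=1) / observed(x=x)


def frontdoor(x):
    """Frontdoor estimate of P(Y=1 | do(X=x)) from observables only."""
    total = 0.0
    for m in (0, 1):
        p_m_given_x = observed(x=x, m=m) / observed(x=x)
        inner = sum(observed(x=xp, m=m, y=1) / observed(x=xp, m=m)
                    * observed(x=xp)
                    for xp in (0, 1))
        total += p_m_given_x * inner
    return total


for x in (0, 1):
    print(x, truth(x), frontdoor(x), naive(x))
```

In this toy model the frontdoor estimate matches the interventional quantity without ever observing U, whereas the naive conditional probability is biased by the backdoor path through U. The paper applies the same idea to learned embeddings rather than to an enumerated discrete distribution.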