
Causality-Aware Spatiotemporal Graph Neural Networks for Spatiotemporal Time Series Imputation (2403.11960v4)

Published 18 Mar 2024 in cs.LG and stat.ML

Abstract: Spatiotemporal time series are usually collected via monitoring sensors placed at different locations, and often contain missing values due to various failures, such as mechanical damage and Internet outages. Imputing the missing values is crucial for analyzing time series. When recovering a specific data point, most existing methods consider all the information relevant to that point regardless of the cause-and-effect relationship. During data collection, it is inevitable that some unknown confounders are included, e.g., background noise in time series and non-causal shortcut edges in the constructed sensor network. These confounders could open backdoor paths and establish non-causal correlations between the input and output. Over-exploiting these non-causal correlations could cause overfitting. In this paper, we first revisit spatiotemporal time series imputation from a causal perspective and show how to block the confounders via the frontdoor adjustment. Based on the results of frontdoor adjustment, we introduce a novel Causality-Aware Spatiotemporal Graph Neural Network (Casper), which contains a novel Prompt Based Decoder (PBD) and a Spatiotemporal Causal Attention (SCA). PBD could reduce the impact of confounders and SCA could discover the sparse causal relationships among embeddings. Theoretical analysis reveals that SCA discovers causal relationships based on the values of gradients. We evaluate Casper on three real-world datasets, and the experimental results show that Casper could outperform the baselines and could effectively discover causal relationships.


Summary

  • The paper introduces a novel causality-aware GNN model that leverages spatiotemporal causal attention to effectively identify and utilize genuine cause-effect relationships in sensor data.
  • It employs a prompt-based decoder to incorporate global contextual information while mitigating the adverse effects of confounders on imputation accuracy.
  • Experimental results on real-world datasets demonstrate superior performance in reducing MAE and MSE compared to traditional deep learning approaches.

Exploring Causality in Spatiotemporal Time Series Imputation with Graph Neural Networks

Introduction

Spatiotemporal time series data, obtained from sensor networks monitoring various phenomena, often suffer from missing values due to sensor malfunctions or other disruptions. The imputation of these missing values is crucial for subsequent data analysis and decision-making processes. Traditional methods and most existing deep learning approaches do not differentiate between causal and non-causal relationships when attempting imputation, potentially leveraging spurious correlations introduced by confounders.

To address these challenges, Jing et al. propose the Causality-Aware Spatiotemporal Graph Neural Network (Casper). The method is grounded in a causal perspective, identifying and leveraging the cause-and-effect relationships intrinsic to spatiotemporal data. Casper incorporates a Spatiotemporal Causal Attention (SCA) mechanism and a Prompt Based Decoder (PBD), providing a framework that is robust to confounders and that emphasizes causal relationships for imputation.

Methodology

Casper revisits the spatiotemporal time series imputation problem through a causal lens, explicitly modeling the interactions between input, output, embeddings, and confounders with a structural causal model (SCM). The work highlights the detrimental role of confounders in creating spurious correlations and blocks them via the frontdoor adjustment, disentangling causal relationships from non-causal correlations.
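
For reference, the textbook frontdoor adjustment (Pearl) takes the form

$$P(Y \mid do(X)) = \sum_{z} P(Z = z \mid X) \sum_{x'} P(Y \mid X = x', Z = z)\, P(X = x'),$$

where reading the learned embeddings as the mediator Z between input X and output Y is our interpretation of the paper's setup. The formula is valid when Z intercepts every directed path from X to Y, there is no unblocked backdoor path from X to Z, and X blocks all backdoor paths from Z to Y.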

The architecture of Casper comprises two main components:

  • Spatiotemporal Causal Attention (SCA): discovers and exploits sparse causal relationships among the time series embeddings; the selection is grounded in gradient values, which filters out non-causal correlations.
  • Prompt Based Decoder (PBD): rather than directly approximating the entire data context for imputation, PBD uses learnable prompts to encapsulate the dataset's global contextual information, mitigating the influence of confounders. (A minimal sketch of both components follows this list.)
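
Since the paper's own layers are not reproduced here, the following PyTorch sketch illustrates the two ideas under our own assumptions: a straight-through Gumbel-softmax gate to induce sparsity in attention, and learnable prompts concatenated into the decoder. Class names, shapes, and the gating mechanism are illustrative, not Casper's published code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SparseCausalAttention(nn.Module):
    """Attention with a hard, differentiable keep/drop gate per key, so that
    (putatively non-causal) positions are masked out entirely."""
    def __init__(self, dim: int, tau: float = 1.0):
        super().__init__()
        self.q, self.k, self.v = nn.Linear(dim, dim), nn.Linear(dim, dim), nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, 1)   # per-position keep/drop logit
        self.tau = tau

    def forward(self, x: torch.Tensor) -> torch.Tensor:       # x: (batch, seq, dim)
        q, k, v = self.q(x), self.k(x), self.v(x)
        scores = q @ k.transpose(-2, -1) / x.size(-1) ** 0.5  # (batch, seq, seq)
        # Straight-through Gumbel-softmax: hard 0/1 gates in the forward
        # pass while gradients still flow in the backward pass.
        logits = self.gate(x).squeeze(-1)                     # (batch, seq)
        keep = F.gumbel_softmax(torch.stack([logits, -logits], dim=-1),
                                tau=self.tau, hard=True)[..., 0]
        scores = scores + (keep.unsqueeze(1) - 1.0) * 1e9     # drop gated-out keys
        return F.softmax(scores, dim=-1) @ v

class PromptBasedDecoder(nn.Module):
    """Decoder that mixes each embedding with learnable prompt vectors,
    standing in for the dataset's global context."""
    def __init__(self, dim: int, n_prompts: int = 4):
        super().__init__()
        self.prompts = nn.Parameter(0.02 * torch.randn(n_prompts, dim))
        self.out = nn.Linear(2 * dim, 1)

    def forward(self, h: torch.Tensor) -> torch.Tensor:       # h: (batch, seq, dim)
        ctx = self.prompts.mean(dim=0).expand_as(h)           # pooled global prompt
        return self.out(torch.cat([h, ctx], dim=-1))          # (batch, seq, 1)

x = torch.randn(2, 12, 32)                # 2 series, 12 steps, 32-dim embeddings
model = nn.Sequential(SparseCausalAttention(32), PromptBasedDecoder(32))
print(model(x).shape)                     # torch.Size([2, 12, 1])
```

The hard gate is the design point worth noting: a binary keep/drop decision, rather than small-but-nonzero attention weights, is what makes the discovered relationships genuinely sparse.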

Theoretical Insights

The paper provides a theoretical foundation for the SCA mechanism's ability to distinguish causal from non-causal relationships based on the values of gradients. This result not only simplifies the interpretation of the discovered causal relations but also focuses the model on genuinely influential data points, improving imputation accuracy and robustness.
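
As a rough illustration of the gradient-based view, the sketch below scores each input position by the gradient magnitude of one output with respect to it. This is plain gradient saliency under our assumptions about model shapes (matching the sketch above), not the paper's exact criterion.

```python
import torch
import torch.nn as nn

def gradient_influence(model: nn.Module, x: torch.Tensor, t: int) -> torch.Tensor:
    """Score how strongly each position of x influences the output at step t,
    using gradient magnitude as an illustrative proxy for causal strength."""
    x = x.detach().clone().requires_grad_(True)
    y = model(x)                     # expects (batch, seq, dim) -> (batch, seq, 1)
    y[:, t, 0].sum().backward()      # scalar objective: the output at step t
    return x.grad.norm(dim=-1)       # (batch, seq) influence scores
```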

Experimental Evaluation

Evaluated on three real-world datasets, Casper shows superior performance over existing baselines in terms of MAE and MSE. These results underline Casper's efficacy in leveraging causal relationships for imputation, even in the presence of confounders.
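
Imputation benchmarks conventionally report error only over positions that were deliberately held out from the model; a minimal sketch of masked MAE and MSE in that convention (not the paper's exact evaluation code):

```python
import torch

def masked_mae(pred: torch.Tensor, target: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Mean absolute error at held-out positions only (mask == 1)."""
    return ((pred - target).abs() * mask).sum() / mask.sum()

def masked_mse(pred: torch.Tensor, target: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """Mean squared error at held-out positions only (mask == 1)."""
    return (((pred - target) ** 2) * mask).sum() / mask.sum()
```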

Future Directions

The introduction of causality into the imputation of spatiotemporal time series opens new avenues for research, including the potential for discovering more complex causal mechanisms within sensor networks and extending these concepts to other domains where cause-and-effect relationships play a crucial role. Furthermore, the integration of causality could provide a new paradigm for designing more robust and interpretable machine learning models across various applications.

Conclusion

Casper addresses the critical issue of confounders in spatiotemporal time series imputation by applying causality theory. Through its causality-aware architecture, it not only achieves superior imputation performance but also offers a pathway toward understanding the underlying cause-and-effect relationships in sensor network data. This work represents a significant step in integrating causality with graph neural networks, offering insights that could shape future approaches to spatiotemporal data analysis and beyond.