Deciphering Spatio-Temporal Graph Forecasting: A Causal Lens and Treatment (2309.13378v1)
Abstract: Spatio-Temporal Graph (STG) forecasting is a fundamental task in many real-world applications. Spatio-Temporal Graph Neural Networks have emerged as the most popular method for STG forecasting, but they often struggle with temporal out-of-distribution (OoD) issues and dynamic spatial causation. In this paper, we propose a novel framework called CaST to tackle these two challenges via causal treatments. Concretely, leveraging a causal lens, we first build a structural causal model to decipher the data generation process of STGs. To handle the temporal OoD issue, we employ the back-door adjustment by a novel disentanglement block to separate invariant parts and temporal environments from input data. Moreover, we utilize the front-door adjustment and adopt the Hodge-Laplacian operator for edge-level convolution to model the ripple effect of causation. Experimental results on three real-world datasets demonstrate the effectiveness and practicality of CaST, which consistently outperforms existing methods with good interpretability.
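The abstract names the Hodge-Laplacian operator for edge-level convolution but does not give its form. As a rough illustration only, the sketch below assumes the standard Hodge 1-Laplacian, L1 = B1^T B1 + B2 B2^T, where B1 is the node-to-edge incidence matrix and B2 the edge-to-triangle incidence matrix (omitted here because the toy graph has no triangles). The helper names `incidence_matrix` and `hodge_1_laplacian` and the toy path graph are hypothetical and are not taken from CaST's implementation.

```python
import numpy as np


def incidence_matrix(num_nodes, edges):
    """Node-to-edge incidence matrix B1 with shape (num_nodes, num_edges).

    Each edge (u, v) is given the orientation u -> v:
    B1[u, e] = -1 and B1[v, e] = +1.
    """
    b1 = np.zeros((num_nodes, len(edges)))
    for e, (u, v) in enumerate(edges):
        b1[u, e] = -1.0
        b1[v, e] = 1.0
    return b1


def hodge_1_laplacian(b1, b2=None):
    """Standard Hodge 1-Laplacian L1 = B1^T B1 + B2 B2^T.

    The B2 term is skipped when no triangles are supplied (assumption:
    the graph is treated as a 1-dimensional simplicial complex).
    """
    l1 = b1.T @ b1
    if b2 is not None:
        l1 = l1 + b2 @ b2.T
    return l1


# Hypothetical toy data: a path graph 0 - 1 - 2 - 3 with three edges.
edges = [(0, 1), (1, 2), (2, 3)]
b1 = incidence_matrix(4, edges)
l1 = hodge_1_laplacian(b1)

# Put a unit signal on edge (0, 1) and propagate it over edges.
edge_signal = np.array([1.0, 0.0, 0.0])
one_hop = l1 @ edge_signal   # reaches edge (1, 2), which shares node 1
two_hop = l1 @ one_hop       # now also reaches edge (2, 3)
```

In this toy setting the signal reaches only the adjacent edge after one application of L1 and the farther edge after two, which gives a simple picture of the ripple effect the abstract mentions. A learned edge-level convolution would typically add trainable weights and a normalisation step, neither of which is shown here.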