Graph Out-of-Distribution Generalization via Causal Intervention (2402.11494v2)
Abstract: Out-of-distribution (OOD) generalization has gained increasing attention for learning on graphs, as graph neural networks (GNNs) often exhibit performance degradation under distribution shifts. The challenge is that distribution shifts on graphs involve intricate interconnections between nodes, and environment labels are often absent from the data. In this paper, we adopt a bottom-up data-generative perspective and reveal a key observation through causal analysis: the crux of GNNs' failure in OOD generalization lies in the latent confounding bias from the environment. This bias misguides the model into leveraging environment-sensitive correlations between ego-graph features and target nodes' labels, resulting in poor generalization to new unseen nodes. Building on this analysis, we introduce a conceptually simple yet principled approach for training robust GNNs under node-level distribution shifts, without prior knowledge of environment labels. Our method uses a new learning objective derived from causal inference that coordinates an environment estimator and a mixture-of-experts GNN predictor. The new approach can counteract the confounding bias in training data and facilitate learning generalizable predictive relations. Extensive experiments demonstrate that our model can effectively enhance generalization under various types of distribution shifts and yield up to 27.4% accuracy improvement over state-of-the-art methods on graph OOD generalization benchmarks. Source code is available at https://github.com/fannie1208/CaNet.
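To make the pairing of an environment estimator with a mixture-of-experts GNN predictor more concrete, here is a minimal sketch of a single layer in plain Python. It is not the authors' implementation (see the repository linked above for that). The mean aggregation, the softmax gate, and every name in it (`moe_gnn_layer`, `gate_W`, `expert_Ws`, the toy graph) are illustrative assumptions, and the causal learning objective from the abstract is not reproduced here.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def mean_aggregate(features, neighbors, node):
    """Average the features of a node and its one-hop neighbors."""
    ids = [node] + neighbors[node]
    dim = len(features[node])
    return [sum(features[i][d] for i in ids) / len(ids) for d in range(dim)]


def moe_gnn_layer(features, neighbors, gate_W, expert_Ws):
    """One hypothetical layer: returns (node outputs, soft environment weights)."""
    outputs, env_weights = [], []
    for node in range(len(features)):
        agg = mean_aggregate(features, neighbors, node)
        # Environment estimator (assumed form): a soft assignment of the node
        # to K pseudo-environments, inferred without environment labels.
        pi = softmax(matvec(gate_W, agg))
        # Each expert applies its own weights; the outputs are mixed by pi.
        expert_outs = [matvec(W, agg) for W in expert_Ws]
        out_dim = len(expert_outs[0])
        mixed = [sum(p * out[d] for p, out in zip(pi, expert_outs))
                 for d in range(out_dim)]
        outputs.append([max(0.0, v) for v in mixed])  # ReLU
        env_weights.append(pi)
    return outputs, env_weights


def random_matrix(rows, cols, rng):
    """Small random weight matrix, used only for this toy example."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


# Toy path graph with 3 nodes, 2 input features, K = 2 pseudo-environments.
rng = random.Random(0)
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
K, in_dim, out_dim = 2, 2, 3
gate_W = random_matrix(K, in_dim, rng)
expert_Ws = [random_matrix(out_dim, in_dim, rng) for _ in range(K)]
outputs, env_weights = moe_gnn_layer(features, neighbors, gate_W, expert_Ws)
```

In this sketch each node receives its own mixture weights, so nodes whose ego-graphs look different can be routed to different experts. How those weights are trained so that they counteract the confounding bias is determined by the paper's objective and is outside the scope of this example.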
- Qitian Wu
- Fan Nie
- Chenxiao Yang
- Tianyi Bao
- Junchi Yan