
Cooperative Classification and Rationalization for Graph Generalization (2403.06239v1)

Published 10 Mar 2024 in cs.LG and cs.AI

Abstract: Graph Neural Networks (GNNs) have achieved impressive results in graph classification tasks, but they struggle to generalize when faced with out-of-distribution (OOD) data. Several approaches have been proposed to address this problem. One is to diversify the training distributions in vanilla classification by modifying the data environment, yet environment information is difficult to access. Another promising approach is rationalization, which extracts invariant rationales for predictions. However, extracting rationales is difficult because learning signals are limited, which results in less accurate rationales and degraded predictions. To address these challenges, we propose a Cooperative Classification and Rationalization (C2R) method, consisting of a classification module and a rationalization module. Specifically, we first assume that multiple environments are available in the classification module. Then, we introduce diverse training distributions using an environment-conditional generative network, enabling robust graph representations. Meanwhile, the rationalization module employs a separator to identify relevant rationale subgraphs, while the remaining non-rationale subgraphs are de-correlated from the labels. Next, we align graph representations from the classification module with rationale subgraph representations using knowledge distillation, strengthening the learning signal for rationales. Finally, we infer multiple environments by gathering non-rationale representations and incorporate them into the classification module for cooperative learning. Extensive experiments on both benchmark and synthetic datasets demonstrate the effectiveness of C2R. Code is available at https://github.com/yuelinan/Codes-of-C2R.
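
Two steps named in the abstract, the knowledge-distillation alignment and the inference of environments from non-rationale representations, can be illustrated with a minimal sketch. This is not the authors' implementation (see the linked repository for that). The function names, the temperature value, the choice of the classification module as the teacher, the KL direction, and the use of plain k-means to group non-rationale vectors are all assumptions made for illustration.

```python
import math
import random


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(z - m) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on temperature-softened distributions.

    Assumption: the classification module plays the teacher and the
    rationale predictor plays the student.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def infer_environments(vectors, k, iters=20, seed=0):
    """Group non-rationale vectors into k hypothetical environments.

    Assumption: plain k-means stands in for whatever grouping the
    method actually uses. Returns one environment index per vector.
    """
    rng = random.Random(seed)
    centers = rng.sample(vectors, k)
    labels = [0] * len(vectors)
    for _ in range(iters):
        # Assign each vector to its nearest center.
        labels = [
            min(range(k), key=lambda c: squared_distance(v, centers[c]))
            for v in vectors
        ]
        # Move each center to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Toy usage: made-up logits and two well-separated groups of vectors.
loss = distillation_loss([2.0, 0.5, -1.0], [1.0, 1.0, 0.0])
envs = infer_environments([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]], k=2)
```

In a training loop, a term like `distillation_loss` would be added to the rationale predictor's objective, and the inferred environment indices would be passed back to the classification module as its environment labels.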

Authors (6)
  1. Linan Yue
  2. Qi Liu
  3. Ye Liu
  4. Weibo Gao
  5. Fangzhou Yao
  6. Wenfeng Li
Citations (7)
