Few-Shot Causal Representation Learning for Out-of-Distribution Generalization on Heterogeneous Graphs (2401.03597v3)
Abstract: Heterogeneous graph few-shot learning (HGFL) has been developed to address the label sparsity issue in heterogeneous graphs (HGs), which consist of various types of nodes and edges. The core idea of HGFL is to extract knowledge from richly labeled classes in a source HG, transfer this knowledge to a target HG to facilitate learning new classes with few-labeled training data, and finally make predictions on unlabeled testing data. Existing methods typically assume that the source HG, the training data, and the testing data all share the same distribution. In practice, however, distribution shifts among these three types of data are inevitable for two reasons: (1) the scarcity of source HGs that match the target HG's distribution, and (2) the unpredictable data generation mechanism of the target HG. Such distribution shifts render knowledge transfer ineffective and degrade the learning performance of existing methods, giving rise to a novel problem of out-of-distribution (OOD) generalization in HGFL. To address this challenging problem, we propose a novel Causal OOD Heterogeneous graph Few-shot learning model, namely COHF. In COHF, we first characterize distribution shifts in HGs with a structural causal model, establishing an invariance principle for OOD generalization in HGFL. Then, following this invariance principle, we propose a new variational autoencoder-based heterogeneous graph neural network to mitigate the impact of distribution shifts. Finally, by integrating this network with a novel meta-learning framework, COHF effectively transfers knowledge to the target HG to predict new classes with few-labeled data. Extensive experiments on seven real-world datasets demonstrate the superior performance of COHF over state-of-the-art methods.
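To make the few-shot prediction step concrete, the sketch below illustrates a common few-shot classification primitive: forming per-class prototypes from a handful of labeled support embeddings and assigning a query to its nearest prototype. This is a minimal, hypothetical illustration of episodic few-shot inference in general — the toy embeddings stand in for representations a heterogeneous GNN would produce, and it does not reproduce COHF's causal or variational components.

```python
import numpy as np

# Hypothetical support set for a 2-way 3-shot episode: three labeled
# 2-dimensional node embeddings per class (the "few-labeled" data).
support = np.array([
    [[0.1, 0.0], [0.0, 0.2], [-0.1, 0.1]],   # class 0 embeddings
    [[2.0, 2.1], [1.9, 2.0], [2.1, 1.9]],    # class 1 embeddings
])

# Class prototypes: the mean embedding of each class's support set.
prototypes = support.mean(axis=1)             # shape (2, 2)

# An unlabeled query embedding, clearly closer to class 1.
query = np.array([1.8, 2.2])

# Classify by nearest prototype under squared Euclidean distance.
dists = ((prototypes - query) ** 2).sum(axis=1)
pred = int(dists.argmin())
print(pred)  # → 1
```

In an episodic meta-learning setup, many such support/query episodes are sampled from the source graph so that the embedding function itself is trained to make nearest-prototype classification accurate on unseen classes.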