
Few-Shot Causal Representation Learning for Out-of-Distribution Generalization on Heterogeneous Graphs (2401.03597v3)

Published 7 Jan 2024 in cs.LG and cs.AI

Abstract: Heterogeneous graph few-shot learning (HGFL) has been developed to address the label sparsity issue in heterogeneous graphs (HGs), which consist of various types of nodes and edges. The core concept of HGFL is to extract knowledge from rich-labeled classes in a source HG, transfer this knowledge to a target HG to facilitate learning new classes with few-labeled training data, and finally make predictions on unlabeled testing data. Existing methods typically assume that the source HG, training data, and testing data all share the same distribution. However, in practice, distribution shifts among these three types of data are inevitable due to two reasons: (1) the limited availability of the source HG that matches the target HG distribution, and (2) the unpredictable data generation mechanism of the target HG. Such distribution shifts result in ineffective knowledge transfer and poor learning performance in existing methods, thereby leading to a novel problem of out-of-distribution (OOD) generalization in HGFL. To address this challenging problem, we propose a novel Causal OOD Heterogeneous graph Few-shot learning model, namely COHF. In COHF, we first characterize distribution shifts in HGs with a structural causal model, establishing an invariance principle for OOD generalization in HGFL. Then, following this invariance principle, we propose a new variational autoencoder-based heterogeneous graph neural network to mitigate the impact of distribution shifts. Finally, by integrating this network with a novel meta-learning framework, COHF effectively transfers knowledge to the target HG to predict new classes with few-labeled data. Extensive experiments on seven real-world datasets have demonstrated the superior performance of COHF over the state-of-the-art methods.


Summary

  • The paper introduces COHF, a novel model employing structural causal modeling and meta-learning to tackle few-shot learning on heterogeneous graphs under distribution shifts.
  • It combines a variational autoencoder-based HGNN with causal inference to extract invariant representations and improve robustness in OOD scenarios.
  • Experiments on seven real-world datasets show that COHF outperforms state-of-the-art methods, maintaining high accuracy even with minimal labeled data.

Overview of COHF Model for Learning on Heterogeneous Graphs

Heterogeneous graphs, with their variety of node and edge types, are an expressive model for representing complex systems. However, they also present unique challenges for machine learning, owing to the difficulty of obtaining enough labeled data to train effective models. This challenge is exacerbated when the data distribution shifts, a common occurrence in real-world scenarios. The authors propose COHF (Causal OOD Heterogeneous graph Few-shot learning model) to tackle this problem, facilitating knowledge transfer from a source graph to a target graph even in the presence of distribution shifts.

The Problem of Few-Shot Learning on Heterogeneous Graphs

Heterogeneous graph few-shot learning (HGFL) addresses the scarcity of labeled data by leveraging knowledge from a source graph to learn new classes in a target graph from very few labeled examples. Notably, standard HGFL methods assume that the source graph, the few-shot training data, and the testing data are identically distributed (the I.I.D. assumption), which is unrealistic in many real-world applications. To bridge this gap, COHF focuses on out-of-distribution (OOD) generalization in HGFL: maintaining performance despite distribution shifts. The sketch below illustrates the episodic setup that few-shot node classification typically uses.
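To make the setting concrete, here is a minimal sketch of how an N-way K-shot episode is commonly sampled for few-shot node classification. The function name and data layout are illustrative assumptions, not the paper's code.

```python
import random

def sample_episode(labels_by_class, n_way=3, k_shot=5, q_query=10):
    """Sample one N-way K-shot episode for few-shot node classification.

    labels_by_class: dict mapping class id -> list of node ids (assumed layout).
    Returns support/query lists of (node_id, episode_label) pairs.
    """
    classes = random.sample(list(labels_by_class), n_way)
    support, query = [], []
    for episode_label, c in enumerate(classes):
        nodes = random.sample(labels_by_class[c], k_shot + q_query)
        support += [(n, episode_label) for n in nodes[:k_shot]]
        query += [(n, episode_label) for n in nodes[k_shot:]]
    return support, query

# Example: three classes, five labeled support nodes each, ten queries each.
toy = {0: list(range(0, 30)), 1: list(range(30, 60)), 2: list(range(60, 90))}
support, query = sample_episode(toy)
print(len(support), len(query))  # 15 45
```

The model is trained on many such episodes drawn from the source graph and evaluated on episodes drawn from the target graph, which is where distribution shift bites.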

Key Innovations in COHF

The COHF model is grounded in causal inference, aiming to establish an invariance principle that enables effective knowledge transfer and reliable predictions amid distribution changes. Here's what sets COHF apart:

  1. Structural Causal Model (SCM): COHF introduces an SCM that characterizes the label-generating process within heterogeneous graphs. This model helps to identify invariant factors that remain stable across different distributions, which are key to learning robust representations.
  2. Variational Autoencoder-based HGNN (VAE-HGNN): Built on the SCM, the VAE-HGNN component of COHF extracts factors from the source graph that are resistant to distribution shifts. This is accomplished via a graph neural network that filters out distribution-dependent noise and focuses on the consistent elements vital for accurate classification (see the first sketch after this list).
  3. Meta-Learning Integration: COHF employs a novel meta-learning approach that evaluates and prioritizes the most informative few-labeled samples in the target graph, whose representations are then used to make precise predictions (see the second sketch after this list).
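As a rough illustration of the VAE-HGNN idea, the sketch below encodes node features into a latent Gaussian and samples via the reparameterization trick. A plain MLP stands in for the paper's heterogeneous GNN encoder; all module names and dimensions are assumptions, not COHF's actual architecture.

```python
import torch
import torch.nn as nn

class LatentNodeEncoder(nn.Module):
    """Toy VAE-style node encoder: maps node features to a latent Gaussian
    and samples with the reparameterization trick. A two-layer MLP stands
    in for a heterogeneous GNN; message passing is omitted for brevity."""

    def __init__(self, in_dim, hidden_dim, latent_dim):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden_dim), nn.ReLU())
        self.mu = nn.Linear(hidden_dim, latent_dim)
        self.logvar = nn.Linear(hidden_dim, latent_dim)

    def forward(self, x):
        h = self.backbone(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterize: z = mu + sigma * eps, eps ~ N(0, I).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        # KL divergence to a standard normal prior, averaged over nodes.
        kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1).mean()
        return z, kl

enc = LatentNodeEncoder(in_dim=64, hidden_dim=32, latent_dim=16)
z, kl = enc(torch.randn(10, 64))   # 10 nodes with 64-dim features
print(z.shape, kl.item())          # torch.Size([10, 16])
```

In a full VAE the KL term would be traded off against a reconstruction loss; the point here is only the probabilistic bottleneck that encourages the latent factors to discard distribution-specific noise.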
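And here is a hedged sketch of an episodic, prototype-style classification step in which support samples receive learned importance weights, reflecting the summary's point about prioritizing informative few-labeled samples. This is a generic construction, not COHF's exact meta-learning procedure.

```python
import torch

def weighted_prototype_logits(support_z, support_y, query_z, weights, n_way):
    """Classify query embeddings by distance to importance-weighted class
    prototypes built from the support set (generic sketch, not COHF itself)."""
    protos = []
    for c in range(n_way):
        mask = support_y == c
        w = torch.softmax(weights[mask], dim=0)          # per-class sample weights
        protos.append((w.unsqueeze(1) * support_z[mask]).sum(dim=0))
    protos = torch.stack(protos)                         # [n_way, latent_dim]
    return -torch.cdist(query_z, protos)                 # closer => higher logit

n_way, d = 3, 16
support_z = torch.randn(15, d)                           # 3 classes x 5 shots
support_y = torch.arange(n_way).repeat_interleave(5)
query_z = torch.randn(45, d)
weights = torch.randn(15, requires_grad=True)            # learned in practice
logits = weighted_prototype_logits(support_z, support_y, query_z, weights, n_way)
print(logits.shape)  # torch.Size([45, 3])
```

During meta-training, a cross-entropy loss on these logits would update both the encoder and whatever module produces the sample weights, one episode at a time.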

COHF's Superior Performance

A series of experiments on seven real-world datasets highlights the superior performance of COHF compared with several state-of-the-art baselines. Even under significant distribution shifts, COHF maintained a high level of accuracy, underlining its robustness in OOD scenarios.

Looking Forward

COHF provides a compelling solution to the challenges posed by few-shot learning on heterogeneous graphs, particularly in OOD settings. Future explorations might take COHF into other domains of graph learning, potentially expanding its applicability and reinforcing its status as a potent tool for dealing with distribution shift challenges in graph-based data analysis.

Overall, COHF represents a meaningful advancement in the quest for better generalization in heterogeneous graph learning, and it sets the stage for further advancements in the field.
