ZeroG: Investigating Cross-dataset Zero-shot Transferability in Graphs (2402.11235v2)
Abstract: With the development of foundation models such as LLMs, zero-shot transfer learning has become increasingly significant. This is highlighted by the generative capabilities of NLP models like GPT-4 and the retrieval-based approaches of CV models like CLIP, both of which effectively bridge the gap between seen and unseen data. In the realm of graph learning, the continuous emergence of new graphs and the challenges of human labeling also amplify the necessity for zero-shot transfer learning, driving the exploration of approaches that can generalize across diverse graph data without dataset-specific or label-specific fine-tuning. In this study, we extend such paradigms to zero-shot transferability in graphs by introducing ZeroG, a new framework tailored to enable cross-dataset generalization. To address inherent challenges such as feature misalignment, mismatched label spaces, and negative transfer, we leverage an LLM to encode both node attributes and class semantics, ensuring consistent feature dimensions across datasets. We also propose a prompt-based subgraph sampling module that enriches the semantic and structural information of extracted subgraphs using prompting nodes and neighborhood aggregation, respectively. We further adopt a lightweight fine-tuning strategy that reduces the risk of overfitting and maintains the zero-shot learning efficacy of the LLM. The results underscore the effectiveness of ZeroG in achieving significant cross-dataset zero-shot transferability, opening pathways for the development of graph foundation models. Code and data are available at https://github.com/NineAbyss/ZeroG.
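To make the recipe above concrete, here is a minimal sketch of the core idea the abstract describes: one shared text encoder maps node attributes and class descriptions into the same fixed-dimension space, so an unseen dataset can be classified zero-shot by embedding similarity. This is not the authors' implementation; the encoder choice (`all-MiniLM-L6-v2` via `sentence-transformers`), the toy texts, and the toy adjacency are illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of the unified-encoding
# idea: node attributes and class semantics from any dataset are embedded by
# the same language model, and zero-shot classification reduces to picking
# the semantically closest class. All inputs below are toy assumptions.
import torch
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any sentence encoder works

def encode(texts):
    # L2-normalized embeddings, so dot products equal cosine similarities.
    return encoder.encode(texts, convert_to_tensor=True, normalize_embeddings=True)

# Node attributes and class descriptions from an *unseen* target dataset.
node_texts = [
    "Title: Semi-supervised classification with graph convolutional networks ...",
    "Title: A robustly optimized BERT pretraining approach ...",
]
class_descriptions = [
    "This paper belongs to the field of graph machine learning.",
    "This paper belongs to the field of natural language processing.",
]

node_emb = encode(node_texts)            # [num_nodes, d]
class_emb = encode(class_descriptions)   # [num_classes, d]

# Stand-in for the neighborhood aggregation the abstract mentions: smooth
# node embeddings over a row-normalized adjacency (self-loops dominate).
adj = torch.tensor([[0.8, 0.2],
                    [0.2, 0.8]])
node_emb = adj @ node_emb

# Zero-shot prediction: assign each node its semantically closest class.
pred = (node_emb @ class_emb.T).argmax(dim=-1)
print(pred)  # expected with these toy texts: tensor([0, 1])
```

In the full method, the encoder itself would first be tuned on source datasets with a lightweight, parameter-efficient scheme (the references include LoRA) before being applied to unseen targets; the sketch omits that training step and shows only the unified-encoding inference path.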
- GPT-4 technical report. arXiv preprint arXiv:2303.08774 (2023).
- Can we trust the evaluation on ChatGPT? arXiv preprint arXiv:2303.12767 (2023).
- ConGraT: Self-Supervised Contrastive Pretraining for Joint Graph and Text Embeddings. arXiv preprint arXiv:2305.14321 (2023).
- Large language models meet Harry Potter: A dataset for aligning dialogue agents with characters. In Findings of the Association for Computational Linguistics: EMNLP 2023. 8506–8520.
- Exploring the potential of large language models (LLMs) in learning on graphs. arXiv preprint arXiv:2307.03393 (2023).
- Wiener Graph Deconvolutional Network Improves Graph Self-Supervised Learning. In AAAI. 7131–7139.
- Node feature extraction by self-supervised multi-scale neighborhood prediction. arXiv preprint arXiv:2111.00064 (2021).
- BERT: Pre-training of deep bidirectional transformers for language understanding. arXiv preprint arXiv:1810.04805 (2018).
- SimTeG: A frustratingly simple approach improves textual graph learning. arXiv preprint arXiv:2308.02565 (2023).
- Talk like a graph: Encoding graphs for large language models. arXiv preprint arXiv:2310.04560 (2023).
- Hierarchical graph learning for protein–protein interaction. Nature Communications 14, 1 (2023), 1093.
- Protein Multimer Structure Prediction via PPI-guided Prompt Learning. In The Twelfth International Conference on Learning Representations.
- CiteSeer: An automatic citation indexing system. In Proceedings of the third ACM conference on Digital libraries. 89–98.
- Explanations as Features: LLM-Based Features for Text-Attributed Graphs. arXiv preprint arXiv:2305.19523 (2023).
- GraphMAE: Self-supervised masked graph autoencoders. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 594–604.
- LoRA: Low-Rank Adaptation of Large Language Models. In International Conference on Learning Representations.
- Open graph benchmark: Datasets for machine learning on graphs. Advances in neural information processing systems 33 (2020), 22118–22133.
- Can LLMs effectively leverage graph structural information: when and why. arXiv preprint arXiv:2309.16595 (2023).
- Self-supervised learning on graphs: Deep insights and new direction. arXiv preprint arXiv:2006.10141 (2020).
- Diederik P Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014).
- Thomas N Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).
- Semi-supervised graph classification: A hierarchical graph perspective. In The World Wide Web Conference. 972–982.
- A survey of graph meets large language model: Progress and future directions. arXiv preprint arXiv:2311.12399 (2023).
- The Devil is in the Conflict: Disentangled Information Graph Neural Networks for Fraud Detection. In 2022 IEEE International Conference on Data Mining (ICDM). IEEE, 1059–1064.
- GSLB: The Graph Structure Learning Benchmark. arXiv preprint arXiv:2310.05174 (2023).
- One for All: Towards Training One Graph Model for All Classification Tasks. arXiv preprint arXiv:2310.00149 (2023).
- Towards graph foundation models: A survey and beyond. arXiv preprint arXiv:2310.11829 (2023).
- Multi-modal molecule structure–text model for text-based retrieval and editing. Nature Machine Intelligence 5, 12 (2023), 1447–1457.
- RoBERTa: A robustly optimized BERT pretraining approach. arXiv preprint arXiv:1907.11692 (2019).
- GraphPrompt: Unifying pre-training and downstream tasks for graph neural networks. In Proceedings of the ACM Web Conference 2023. 417–428.
- Automating the construction of internet portals with machine learning. Information Retrieval 3 (2000), 127–163.
- Péter Mernyei and Cătălina Cangea. 2020. Wiki-CS: A Wikipedia-based benchmark for graph neural networks. arXiv preprint arXiv:2007.02901 (2020).
- Distributed representations of words and phrases and their compositionality. Advances in neural information processing systems 26 (2013).
- Learning transferable visual models from natural language supervision. In International conference on machine learning. PMLR, 8748–8763.
- Nils Reimers and Iryna Gurevych. 2019. Sentence-BERT: Sentence embeddings using Siamese BERT-networks. arXiv preprint arXiv:1908.10084 (2019).
- Gerard Salton and Christopher Buckley. 1988. Term-weighting approaches in automatic text retrieval. Information processing & management 24, 5 (1988), 513–523.
- Collective classification in network data. AI magazine 29, 3 (2008), 93–106.
- A molecular multimodal foundation model associating molecule graphs with natural language. arXiv preprint arXiv:2209.05481 (2022).
- GPPT: Graph pre-training and prompt tuning to generalize graph neural networks. In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining. 1717–1727.
- All in One: Multi-Task Prompting for Graph Neural Networks. In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD '23). 2120–2131.
- Graph Prompt Learning: A Comprehensive Survey and Beyond. arXiv preprint arXiv:2311.16534 (2023).
- WalkLM: A uniform language model fine-tuning framework for attributed graph embedding. In Thirty-seventh Conference on Neural Information Processing Systems.
- GADBench: Revisiting and Benchmarking Supervised Graph Anomaly Detection. In Thirty-seventh Conference on Neural Information Processing Systems.
- Rethinking graph neural networks for anomaly detection. In International Conference on Machine Learning. PMLR, 21076–21089.
- GraphGPT: Graph instruction tuning for large language models. arXiv preprint arXiv:2310.13023 (2023).
- LLaMA: Open and efficient foundation language models. arXiv preprint arXiv:2302.13971 (2023).
- Laurens Van der Maaten and Geoffrey Hinton. 2008. Visualizing data using t-SNE. Journal of machine learning research 9, 11 (2008).
- Graph attention networks. arXiv preprint arXiv:1710.10903 (2017).
- Deep graph infomax. arXiv preprint arXiv:1809.10341 (2018).
- Can Language Models Solve Graph Problems in Natural Language? arXiv preprint arXiv:2305.10037 (2023).
- Text embeddings by weakly-supervised contrastive pre-training. arXiv preprint arXiv:2212.03533 (2022).
- LLMRec: Large language models with graph augmentation for recommendation. arXiv preprint arXiv:2311.00423 (2023).
- Revisiting semi-supervised learning with graph embeddings. In International conference on machine learning. PMLR, 40–48.
- Natural language is all a graph needs. arXiv preprint arXiv:2308.07134 (2023).
- Graph contrastive learning with augmentations. Advances in neural information processing systems 33 (2020), 5812–5823.
- GraphText: Graph reasoning in text space. arXiv preprint arXiv:2310.01089 (2023).
- Graph contrastive learning with adaptive augmentation. In Proceedings of the Web Conference 2021. 2069–2080.
Authors: Yuhan Li, Peisong Wang, Zhixun Li, Jeffrey Xu Yu, Jia Li