Can GNN be Good Adapter for LLMs? (2402.12984v1)
Abstract: Recently, LLMs have demonstrated superior capabilities in understanding and zero-shot learning on textual data, promising significant advances for many text-related domains. In the graph domain, various real-world scenarios also involve textual data, where tasks and node features can be described by text. These text-attributed graphs (TAGs) have broad applications in social media, recommendation systems, etc. Thus, this paper explores how to utilize LLMs to model TAGs. Previous methods for TAG modeling are based on million-scale LMs; when scaled up to billion-scale LLMs, they face prohibitive computational costs. They also ignore the zero-shot inference capabilities of LLMs. Therefore, we propose GraphAdapter, which uses a graph neural network (GNN) as an efficient adapter in collaboration with LLMs to tackle TAGs. In terms of efficiency, the GNN adapter introduces only a few trainable parameters and can be trained at low computational cost. The entire framework is trained using auto-regression on node text (next-token prediction). Once trained, GraphAdapter can be seamlessly fine-tuned with task-specific prompts for various downstream tasks. Through extensive experiments across multiple real-world TAGs, GraphAdapter based on Llama 2 gains an average improvement of approximately 5% in node classification. Furthermore, GraphAdapter can also adapt to other language models, including RoBERTa and GPT-2. The promising results demonstrate that GNNs can serve as effective adapters for LLMs in TAG modeling.
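To make the adapter idea concrete, below is a minimal sketch (not the authors' code) of how a small, trainable GNN might sit alongside a frozen LLM and be optimized with next-token prediction. It assumes the frozen LLM's per-node hidden states and next-token logits are precomputed and passed in as tensors; the mean-aggregation GNN layer, the single learned fusion gate, and all shapes are illustrative assumptions, not the paper's exact architecture.

```python
# Hedged sketch of a GNN adapter over a frozen LLM (illustrative only).
import torch
import torch.nn as nn

class GNNAdapter(nn.Module):
    """Lightweight GNN that refines frozen-LLM node representations with graph structure."""
    def __init__(self, hidden_dim: int, vocab_size: int):
        super().__init__()
        self.neigh_proj = nn.Linear(hidden_dim, hidden_dim)  # transform aggregated neighbor states
        self.self_proj = nn.Linear(hidden_dim, hidden_dim)   # transform the node's own state
        self.lm_head = nn.Linear(hidden_dim, vocab_size)     # adapter's next-token distribution
        self.gate = nn.Parameter(torch.tensor(0.0))          # learned weight for fusing with the LLM

    def forward(self, node_h, adj, llm_logits):
        # node_h:     [N, hidden_dim]  frozen-LLM representation per node (assumed precomputed)
        # adj:        [N, N]           row-normalized adjacency matrix
        # llm_logits: [N, vocab_size]  frozen LLM's next-token logits (assumed precomputed)
        neigh = adj @ node_h                                  # mean aggregation over neighbors
        fused = torch.relu(self.self_proj(node_h) + self.neigh_proj(neigh))
        adapter_logits = self.lm_head(fused)
        g = torch.sigmoid(self.gate)                          # blend adapter and LLM predictions
        return g * adapter_logits + (1.0 - g) * llm_logits

# Toy usage: only the adapter's few parameters receive gradients; the LLM stays frozen.
N, H, V = 8, 16, 100
adapter = GNNAdapter(H, V)
node_h = torch.randn(N, H)                       # stand-in for frozen LLM hidden states
adj = torch.softmax(torch.randn(N, N), dim=-1)   # stand-in row-stochastic adjacency
llm_logits = torch.randn(N, V)                   # stand-in for frozen LLM next-token logits
targets = torch.randint(0, V, (N,))              # next tokens of each node's text
loss = nn.functional.cross_entropy(adapter(node_h, adj, llm_logits), targets)
loss.backward()                                  # gradients flow only through the adapter
```

The point of the sketch is the division of labor described in the abstract: the frozen LLM supplies text understanding, while a small graph-aware module (a few linear layers here) is the only part trained, via next-token prediction on node text.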
Authors: Xuanwen Huang, Kaiqiao Han, Yang Yang, Dezheng Bao, Quanjin Tao, Ziwei Chai, Qi Zhu