LPNL: Scalable Link Prediction with Large Language Models
Abstract: Exploring the application of LLMs to graph learning is an emerging endeavor. However, the vast amount of information inherent in large graphs poses significant challenges to this process. This work focuses on the link prediction task and introduces $\textbf{LPNL}$ (Link Prediction via Natural Language), a framework based on LLMs designed for scalable link prediction on large-scale heterogeneous graphs. We design novel prompts for link prediction that articulate graph details in natural language. We propose a two-stage sampling pipeline to extract crucial information from the graphs, and a divide-and-conquer strategy that keeps the input within a predefined token limit, addressing the challenge of overwhelming information. We fine-tune a T5 model with a self-supervised objective designed for link prediction. Extensive experimental results demonstrate that LPNL outperforms multiple advanced baselines on link prediction tasks over large-scale graphs.
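The divide-and-conquer strategy in the abstract can be illustrated with a minimal sketch: candidate neighbors are packed into prompt-sized batches under a token budget, each batch is ranked, and the survivors are re-batched until one candidate remains. Everything below is an illustrative assumption, not the paper's implementation: `rank_batch` stands in for the fine-tuned T5 ranker (here it deterministically keeps the lexicographically smallest half), and `token_count` uses whitespace tokens instead of a real tokenizer.

```python
def build_prompt(source, candidates):
    # Hypothetical natural-language link-prediction prompt:
    # a question line plus one line per candidate node.
    lines = [f"Which node links to {source}?"]
    lines += [f"[{i}] {c}" for i, c in enumerate(candidates)]
    return "\n".join(lines)

def token_count(prompt):
    # Stand-in tokenizer: counts whitespace-separated tokens.
    return len(prompt.split())

def rank_batch(source, candidates):
    # Stand-in for the LLM ranker: keep the smallest half (at least one)
    # so the example is deterministic and testable.
    return sorted(candidates)[: max(1, len(candidates) // 2)]

def divide_and_conquer_predict(source, candidates, max_tokens=32):
    # Repeatedly pack candidates into batches whose prompts fit the
    # token budget, rank each batch, and recurse on the survivors.
    while len(candidates) > 1:
        batches, current = [], []
        for c in candidates:
            trial = current + [c]
            if token_count(build_prompt(source, trial)) > max_tokens and current:
                batches.append(current)
                current = [c]
            else:
                current = trial
        if current:
            batches.append(current)
        survivors = []
        for batch in batches:
            survivors.extend(rank_batch(source, batch))
        candidates = survivors
    return candidates[0]
```

With this stand-in ranker, the loop always converges because each ranking call discards at least half of a multi-candidate batch; the real framework would instead keep the candidates the fine-tuned model scores highest.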