One for All: Towards Training One Graph Model for All Classification Tasks (2310.00149v3)
Abstract: Designing a single model to address multiple tasks has been a long-standing objective in artificial intelligence. Recently, LLMs have demonstrated exceptional capability in solving different tasks within the language domain. However, a unified model for various graph tasks remains underexplored, primarily due to challenges unique to the graph learning domain. First, graph data from different areas carry distinct attributes and follow different distributions, making it hard to represent graphs in a single representation space. Second, tasks on graphs diversify into node, link, and graph tasks, requiring distinct embedding strategies. Finally, an appropriate graph prompting paradigm for in-context learning remains unclear. We propose **One for All (OFA)**, the first general framework that uses a single graph model to address the above challenges. Specifically, OFA proposes text-attributed graphs to unify different graph data by describing nodes and edges with natural language, and uses LLMs to encode the diverse and possibly cross-domain text attributes as feature vectors in a shared embedding space. Furthermore, OFA introduces the concept of nodes-of-interest to standardize different tasks under a single task representation. For in-context learning on graphs, OFA introduces a novel graph prompting paradigm that appends prompting substructures to the input graph, enabling it to address varied tasks without fine-tuning. We train the OFA model simultaneously on graph data from multiple domains (including citation networks, molecular graphs, and knowledge graphs) and evaluate it in supervised, few-shot, and zero-shot learning scenarios. OFA performs well across different tasks, making it the first general-purpose, cross-domain classification model on graphs.
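To make the pipeline described in the abstract concrete, below is a minimal, hypothetical sketch of its two central ideas: describing nodes with natural language and embedding those descriptions with a frozen text encoder into one shared space, then appending a prompt node wired to the node of interest so a standard GNN can condition on the task. It assumes the `sentence-transformers` library and PyTorch Geometric purely for illustration; the encoder choice, toy graph, and prompt wiring are assumptions, not the authors' implementation.

```python
# Hypothetical sketch of OFA-style text-attributed graphs and graph prompting.
# Not the authors' code: encoder name, graph layout, and prompt wiring are assumed.
import torch
from sentence_transformers import SentenceTransformer
from torch_geometric.data import Data

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # any frozen text encoder works

# Natural-language descriptions for a toy citation graph (nodes 0-2)
# plus one prompt node describing the task.
node_texts = [
    "Paper title: Graph neural networks. Abstract: ...",
    "Paper title: Attention is all you need. Abstract: ...",
    "Paper title: Language models are few-shot learners. Abstract: ...",
]
prompt_text = "Prompt node: classify the target paper's research area."

# Encode every textual description into the same embedding space.
x = torch.tensor(encoder.encode(node_texts + [prompt_text]), dtype=torch.float)

# Original citation edges, given in both directions.
edge_index = torch.tensor([[0, 1, 1, 2],
                           [1, 0, 2, 1]])

# Append the prompt node (index 3) by connecting it to the node of interest
# (node 0); a downstream GNN reads the task off the prompt node's embedding.
prompt_edges = torch.tensor([[3, 0],
                             [0, 3]])
edge_index = torch.cat([edge_index, prompt_edges], dim=1)

graph = Data(x=x, edge_index=edge_index)
print(graph)  # Data(x=[4, 384], edge_index=[2, 6])
```

Because both data nodes and prompt nodes live in the same text-embedding space, graphs from different domains (citation networks, molecules, knowledge graphs) can in principle be fed to one model, which is the premise behind OFA's cross-domain training.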