
GOFA: A Generative One-For-All Model for Joint Graph Language Modeling (2407.09709v1)

Published 12 Jul 2024 in cs.LG and cs.CL

Abstract: Foundation models, such as LLMs or Large Vision Models (LVMs), have emerged as one of the most powerful tools in the respective fields. However, unlike text and image data, graph data do not have a definitive structure, posing great challenges to developing a Graph Foundation Model (GFM). For example, current attempts at designing general graph models either transform graph data into a language format for LLM-based prediction or still train a GNN model with LLM as an assistant. The former can handle unlimited tasks, while the latter captures graph structure much better -- yet, no existing work can achieve both simultaneously. In this paper, we identify three key desirable properties of a GFM: self-supervised pretraining, fluidity in tasks, and graph awareness. To account for these properties, we extend conventional language modeling to the graph domain and propose a novel generative graph language model, GOFA, to solve the problem. The model interleaves randomly initialized GNN layers into a frozen pre-trained LLM so that the semantic and structural modeling abilities are organically combined. GOFA is pre-trained on newly proposed graph-level next-word prediction, question-answering, and structural tasks to obtain the above GFM properties. The pre-trained model is further fine-tuned on downstream tasks to obtain task-solving ability. The fine-tuned model is evaluated on various downstream tasks, demonstrating a strong ability to solve structural and contextual problems in zero-shot scenarios. The code is available at https://github.com/JiaruiFeng/GOFA.
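
The interleaving design described in the abstract can be illustrated with a short sketch. The following is a minimal PyTorch simplification, assuming each frozen LLM block maps node token states of shape [nodes, tokens, dim] to the same shape and that structure is exchanged between blocks through a per-node summary vector. Names such as `InterleavedGraphLM` and `SimpleGNNLayer` are hypothetical, and the hand-rolled mean-aggregation GNN is a stand-in rather than the paper's actual architecture (see the linked repository for that).

```python
import torch
import torch.nn as nn

class SimpleGNNLayer(nn.Module):
    """Mean-aggregation message passing (illustrative stand-in for the
    paper's GNN layers, which operate on LLM token representations)."""
    def __init__(self, dim):
        super().__init__()
        self.lin = nn.Linear(2 * dim, dim)

    def forward(self, x, edge_index):
        # x: [num_nodes, dim]; edge_index: [2, num_edges] as (src, dst) rows
        src, dst = edge_index
        agg = torch.zeros_like(x)
        deg = torch.zeros(x.size(0), 1, device=x.device)
        agg.index_add_(0, dst, x[src])                      # sum neighbor states
        deg.index_add_(0, dst, torch.ones(src.size(0), 1, device=x.device))
        agg = agg / deg.clamp(min=1)                        # mean aggregation
        return x + self.lin(torch.cat([x, agg], dim=-1))    # residual update

class InterleavedGraphLM(nn.Module):
    """Trainable GNN layers interleaved between frozen LLM blocks,
    sketching the GOFA-style combination of semantics and structure."""
    def __init__(self, llm_blocks, dim):
        super().__init__()
        self.llm_blocks = llm_blocks
        for p in self.llm_blocks.parameters():
            p.requires_grad = False                         # LLM stays frozen
        self.gnn_layers = nn.ModuleList(
            SimpleGNNLayer(dim) for _ in llm_blocks         # GNNs are trained
        )

    def forward(self, node_hidden, edge_index):
        # node_hidden: [num_nodes, seq_len, dim] -- one token sequence per node
        h = node_hidden
        for block, gnn in zip(self.llm_blocks, self.gnn_layers):
            h = block(h)                                    # semantic step (frozen)
            summary = gnn(h.mean(dim=1), edge_index)        # structural step
            h = h + summary.unsqueeze(1)                    # broadcast back to tokens
        return h

# Tiny demo with stand-in transformer blocks on a 5-node path graph.
dim = 64
blocks = nn.ModuleList(
    nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True)
    for _ in range(2)
)
model = InterleavedGraphLM(blocks, dim)
x = torch.randn(5, 16, dim)                                 # 5 nodes, 16 tokens each
edges = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 4]])
print(model(x, edges).shape)                                # torch.Size([5, 16, 64])
```

In this sketch only the GNN parameters receive gradients, mirroring the abstract's point that the pre-trained LLM's semantic ability is preserved while the randomly initialized GNN layers learn to inject graph structure.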

Authors (7)
  1. Lecheng Kong (10 papers)
  2. Jiarui Feng (11 papers)
  3. Hao Liu (497 papers)
  4. Chengsong Huang (11 papers)
  5. Jiaxin Huang (48 papers)
  6. Yixin Chen (126 papers)
  7. Muhan Zhang (89 papers)
Citations (1)