Position: Graph Foundation Models are Already Here (2402.02216v3)
Abstract: Graph Foundation Models (GFMs) are emerging as a significant research topic in the graph domain, aiming to develop graph models trained on extensive and diverse data to enhance their applicability across various tasks and domains. Developing GFMs presents unique challenges beyond those of traditional Graph Neural Networks (GNNs), which are typically trained from scratch for specific tasks on particular datasets. The primary challenge in constructing GFMs lies in effectively leveraging vast and diverse graph data to achieve positive transfer. Drawing inspiration from existing foundation models in the CV and NLP domains, we propose a novel perspective on GFM development by advocating for a "graph vocabulary": a set of basic transferable units that encode the invariances underlying graphs. We ground the construction of this graph vocabulary in essential aspects including network analysis, expressiveness, and stability. Such a vocabulary perspective can potentially advance future GFM design in line with neural scaling laws. All relevant resources for GFM design can be found here.