G2T-FM: A Graph Foundation Model for Tabular Node Features
- G2T-FM is a graph foundation model that augments node representations with structural, neighborhood, and learnable encodings for universal applicability.
- It employs Neighborhood Feature Aggregation, classic structure-based features, and PEARL to fuse local and global graph information effectively.
- Empirical evaluations show that G2T-FM outperforms publicly available GFMs in the in-context regime and surpasses well-tuned classical GNNs after finetuning, across diverse datasets.
G2T-FM is a graph foundation model built to leverage tabular foundation models—specifically TabPFNv2—for graph machine learning tasks involving arbitrary and heterogeneous node features. In contrast to prior graph foundation models (GFMs), which have focused on text-attributed graphs, G2T-FM is designed for universal applicability, including non-textual (tabular) node features, by augmenting node representations with structural, neighborhood, and learnable encodings prior to tabular processing.
1. Model Architecture and Feature Augmentation
The architecture centers on transforming each node’s representation into a form suitable for TabPFNv2 by concatenating the node’s original features with graph-derived augmentations. The process is defined as:
- Neighborhood Feature Aggregation (NFA): For node $v$, aggregate neighborhood features using mean, maximum, and minimum statistics for numerical features, and averaged one-hot encodings for categorical features (a minimal sketch follows this list):

  $$h_v^{\text{NFA}} = \left[\operatorname{mean}_{u \in N(v)} x_u \,\Big\Vert\, \max_{u \in N(v)} x_u \,\Big\Vert\, \min_{u \in N(v)} x_u\right],$$

  where $N(v)$ is the set of neighbors of $v$ and $x_u$ is the feature vector of neighbor $u$.
- Classic Structure-Based Features (SF): Compute the degree, PageRank score, and first $k$ Laplacian eigenvectors for each node; the eigenvectors, extracted from the graph Laplacian, encode positional information.
- Learnable Structure-Based Encodings (PEARL): Each node receives a random initialization that is processed by a small GNN; repeating this for multiple random samples and averaging the results yields expressive, permutation-equivariant encodings.
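A minimal NumPy sketch of NFA, assuming a dense feature matrix `X` and a Python adjacency list `neighbors`; the function names and the zero fallback for isolated nodes are illustrative choices, not the paper's reference implementation:

```python
import numpy as np

def nfa_features(X, neighbors):
    """Mean/max/min aggregation of numerical neighbor features (sketch).

    X         : (n, d) array of numerical node features.
    neighbors : list of lists; neighbors[v] holds the indices of v's neighbors.
    Returns an (n, 3*d) array [mean || max || min] per neighborhood.
    """
    n, d = X.shape
    out = np.zeros((n, 3 * d))
    for v, nbrs in enumerate(neighbors):
        if not nbrs:
            continue  # isolated node: keep zeros (an assumption)
        nb = X[nbrs]
        out[v] = np.concatenate([nb.mean(0), nb.max(0), nb.min(0)])
    return out

def nfa_categorical(codes, neighbors, num_classes):
    """Averaged one-hot encodings of one categorical column (sketch)."""
    onehot = np.eye(num_classes)[codes]  # (n, num_classes)
    return np.stack([
        onehot[nbrs].mean(0) if nbrs else np.zeros(num_classes)
        for nbrs in neighbors
    ])
```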
The final node representation for each node $v$ is:

$$h_v = \left[x_v \,\Big\Vert\, h_v^{\text{NFA}} \,\Big\Vert\, h_v^{\text{SF}} \,\Big\Vert\, h_v^{\text{PEARL}}\right],$$

where $x_v$ are the original node features. This composite is input to TabPFNv2, which applies transformer-based processing, random feature positional encodings, and attention, preserving invariance to feature order and label permutation through mechanisms such as label shuffling.
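To make the composition concrete, here is a hedged end-to-end sketch: classic structural features via `networkx`/SciPy, a toy PEARL-style module (a two-layer message-passing network averaged over random initializations; the actual PEARL architecture differs), and the final concatenation. `nfa_features` is the helper from the previous sketch; all other names and dimensions are assumptions made for illustration.

```python
import networkx as nx
import numpy as np
import torch
from scipy.sparse.linalg import eigsh

def structure_features(G, k=4):
    """Degree, PageRank, and the first k Laplacian eigenvectors (sketch)."""
    nodes = list(G.nodes())
    deg = np.array([G.degree(v) for v in nodes], dtype=float)
    pr = nx.pagerank(G)
    pr = np.array([pr[v] for v in nodes])
    L = nx.normalized_laplacian_matrix(G).astype(float)
    _, eigvecs = eigsh(L, k=k, which="SM")  # smallest eigenpairs
    return np.column_stack([deg, pr, eigvecs])

class PearlLike(torch.nn.Module):
    """Toy stand-in for PEARL: random inits -> small GNN, then averaging."""
    def __init__(self, in_dim=16, hidden=16):
        super().__init__()
        self.in_dim = in_dim
        self.lin1 = torch.nn.Linear(in_dim, hidden)
        self.lin2 = torch.nn.Linear(hidden, hidden)

    def forward(self, adj, num_samples=8):
        outs = []
        for _ in range(num_samples):
            h = torch.randn(adj.shape[0], self.in_dim)  # random init per node
            h = torch.relu(self.lin1(adj @ h))          # message passing
            h = self.lin2(adj @ h)
            outs.append(h)
        # Averaging over random samples approximates permutation equivariance.
        return torch.stack(outs).mean(0)

def augment(G, X, neighbors):
    """h_v = [x_v || NFA || SF || PEARL] for every node (sketch)."""
    sf = structure_features(G)
    adj = torch.tensor(nx.to_numpy_array(G), dtype=torch.float32)
    pearl = PearlLike()(adj).detach().numpy()
    return np.hstack([X, nfa_features(X, neighbors), sf, pearl])
```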
2. Handling Arbitrary Node Features
G2T-FM is agnostic to the type of node features (numerical, categorical, or mixed). Unlike methods that require text-based features or transformations, G2T-FM:
- Accepts original node features in their native format.
- Uses feature aggregation to encode local topology, capturing nuanced relationships in the neighborhood.
- Integrates classic graph descriptors for a richer global context.
- Utilizes PEARL to ensure expressive, permutation-equivariant structural information even in graphs lacking strong symmetry-breaking attributes.
This ensures G2T-FM can process a wide array of graph datasets, including those from domains (e.g., city analytics, crowdsourcing, art networks) where node attributes are not naturally textual.
3. Empirical Performance
Extensive evaluations demonstrate G2T-FM's efficacy:
- In-Context Regime: When the full training set is supplied as the prompt (without parameter updates), G2T-FM outperforms leading public GFMs (AnyGraph, OpenGraph, TS-GNN) and matches well-tuned classical GNNs (GCN, GraphSAGE, GAT, Graph Transformer), measured using average precision (binary classification), accuracy (multiclass classification), and $R^2$ (regression).
- Finetuning: Upon gradient-based optimization of both the TabPFNv2 backbone and the PEARL module, G2T-FM surpasses classic GNNs trained from scratch. Preprocessing steps such as PCA may be applied to accommodate high-dimensional features.
- Dataset Coverage: Strong results are reported on datasets with diverse feature types, such as tolokers-2 (crowdsourcing), city-reviews (urban analytics), and artnet-views (art network analysis), indicating broad applicability.
4. Learning Paradigms: In-Context vs. Finetuned
G2T-FM supports two usage paradigms:
- In-Context Learning (ICL): The model is conditioned on labeled training data and evaluated on test data, with no gradient updates. This regime yields robust performance, showing the power of tabular backbones with graph augmentations even without any parameter adaptation (see the sketch after this list).
- Finetuning: Model parameters are updated using the downstream task's training data. Substantial performance improvements are observed, with task-specific adaptation allowing G2T-FM to outperform strong baselines across tasks. The approach is robust to varying feature spaces and graph topologies.
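A hedged usage sketch of the in-context regime, assuming the `augment` helper from the earlier sketch, given data `G`, `X`, `neighbors`, and node labels `y`, and the public `tabpfn` package (whose 2.x releases ship TabPFNv2 as `TabPFNClassifier`); the random split and the PCA threshold are illustrative, the latter echoing the preprocessing note above:

```python
import numpy as np
from sklearn.decomposition import PCA
from tabpfn import TabPFNClassifier

X_aug = augment(G, X, neighbors)  # [x_v || NFA || SF || PEARL], from above

# Optional: shrink very high-dimensional inputs, per the preprocessing note.
if X_aug.shape[1] > 100:
    X_aug = PCA(n_components=100).fit_transform(X_aug)

# Illustrative random train/test split over nodes.
rng = np.random.default_rng(0)
idx = rng.permutation(len(X_aug))
train_idx, test_idx = idx[: len(idx) // 2], idx[len(idx) // 2 :]

clf = TabPFNClassifier()                 # no gradient updates in ICL
clf.fit(X_aug[train_idx], y[train_idx])  # the training set becomes the prompt
pred = clf.predict(X_aug[test_idx])
```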
5. Implications for Graph Foundation Modeling
The use of TabPFNv2 reveals a previously overlooked direction in graph representation learning:
- Generalization: G2T-FM is capable of processing arbitrary node feature spaces without the limiting assumptions of text-form features or restricted modalities.
- Unification: The methodology unifies tabular and graph modalities, reflecting the similar challenges in arbitrary feature handling, and enables cross-domain advancements through tabular-model-derived techniques.
- Competitive Edge: G2T-FM demonstrates that tabular modeling, when paired with the right graph augmentations, can match and even surpass specialized GNNs, including in "foundation" scenarios with extremely varied data.
- Research Pathways: The presented architecture encourages further augmentation—such as multi-hop neighborhood encoding, dynamic neighborhood interactions, or cross-graph pretraining—supporting expansion into node regression, fraud detection, and other domains.
6. Summary Table: Core Components and Roles
| Component | Description | Role in G2T-FM |
|---|---|---|
| NFA | Neighbor statistics (mean, max, min; averaged one-hot for categorical) | Local context, topology |
| SF | Degree, PageRank, Laplacian eigenvectors | Global and relative position |
| PEARL | Learnable GNN encodings of random initializations | Symmetry breaking, equivariance |
| TabPFNv2 | Transformer tabular backbone | Feature/structure processing |
The confluence of tabular feature processing, contextually rich graph augmentations, and robust learning paradigms establishes G2T-FM as a generic, high-performing graph foundation model suitable for diverse, real-world tasks. This approach signals a broader shift in graph ML, demonstrating the viability and strengths of adapting tabular foundation models for graph-centric challenges (Eremeev et al., 28 Aug 2025).