Prompt-based GNNs: Efficient Graph Adaptation

Updated 28 March 2026

Prompt-based GNNs are graph representation learning methods that incorporate minimal, tunable prompt vectors to bridge the gap between pre-training and downstream tasks.
They inject prompts at various stages (input features, topology, message passing) to enable parameter-efficient adaptation and robust performance in low-data and distribution-shift scenarios.
Empirical studies show that prompt-based GNNs improve classification and link prediction metrics while tuning less than 1% of model parameters, enhancing scalability and flexibility.

Prompt-based Graph Neural Networks (Prompt-based GNNs) constitute a class of graph representation learning methods that inject tunable prompt vectors or prompt mechanisms into the training or adaptation phase of GNNs, aiming to bridge the semantic and task gap between pre-training and downstream tasks. These approaches are motivated by the success of prompt learning in NLP and vision, but are specifically adapted for the unique characteristics of graph data and GNN architectures, including homophily/heterophily, edge structure, scalability, and parameter efficiency. Prompt-based GNNs encompass a spectrum of mechanisms ranging from input-level feature prompting and topology-oriented structural prompts to message-passing and edge-level adaptation, as well as domain-general frameworks such as prompt-expert mixtures and privacy-preserving prompting.

1. General Principles and Motivation

Prompt-based GNNs are designed to address the limitations inherent in standard “pre-train, fine-tune” regimes for deep graph learning. Specifically, naïve fine-tuning frequently induces negative transfer, especially under distribution shift, low-label, or heterophilous graph settings. Prompt-based methods maintain a frozen pre-trained GNN backbone and introduce a minimal, trainable “prompt” component that steers the model to adapt to new or evolving tasks. Typically, prompts are low-dimensional vectors, matrices, or binary masks, and can interact with node features, graph topology, hidden representations, or message-passing protocols (Fang et al., 2022, Liu et al., 2023, Chen et al., 2023, Huang et al., 2024, Wang et al., 5 Nov 2025).

Distinctively in the graph domain, prompt-based GNNs enable:

Parameter-efficient transfer: Only the prompt module is tuned on limited data (often <1% of total parameters), mitigating overfitting and improving sample efficiency.
Task and domain alignment: Prompts allow adaptation across a heterogeneous suite of downstream tasks or graphs, regardless of the pre-training objective.
Unified or modular architectures: Prompts can serve as a domain-agnostic adapter (universal prompting), be composed with specialized “expert” modules, or serve as learned selectors for task-relevant features (Wang et al., 5 Nov 2025).
Distribution-gap bridging: Prompts narrow the gap between pre-training and task data distributions, which matters most in inductive or dynamic scenarios.
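To make the "<1% of total parameters" figure concrete, the following back-of-the-envelope sketch counts parameters for a hypothetical frozen backbone and a single global feature prompt. The layer sizes (128 input features, three 256-unit layers) are illustrative assumptions, not values taken from any cited paper, and the small classifier head that is usually tuned alongside the prompt is left out.

```python
def linear_params(d_in: int, d_out: int) -> int:
    """Weights plus biases of one dense transform."""
    return d_in * d_out + d_out

# Hypothetical frozen backbone: 3 GNN layers, each with one dense transform.
feature_dim, hidden_dim, num_layers = 128, 256, 3
dims = [feature_dim] + [hidden_dim] * num_layers
backbone_params = sum(linear_params(a, b) for a, b in zip(dims[:-1], dims[1:]))

# A single global feature prompt has one entry per input feature.
prompt_params = feature_dim

fraction = prompt_params / backbone_params
print(f"backbone={backbone_params}, prompt={prompt_params}, fraction={fraction:.4%}")
```

Under these assumed sizes the prompt is well below one percent of the backbone; richer prompts (basis prompts, per-layer prompts, gates) raise the count but typically stay small relative to the backbone.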

2. Core Prompting Methodologies

Input and Feature-level Prompting

The earliest prompt-based GNNs (e.g., GPF (Fang et al., 2022), GraphPrompt (Liu et al., 2023)) directly manipulate the input node-feature matrix. The generic prompt injection takes the form

$X^* = X + 1_n p^\top$

where $X$ is the node-feature matrix and $p$ is a global trainable prompt vector. Variants such as GPF-plus use attention over a basis of prompt vectors, employing task or node-specific adaptivity,

$x_i^* = x_i + \sum_{j=1}^k \alpha_{i,j} p_j$

where the mixing weights are attention scores over basis prompts (Fang et al., 2022).
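The two injection rules above can be sketched in a few lines of plain Python, with nested lists standing in for tensors. The function names `gpf` and `gpf_plus` and the scoring rule `alpha_i = softmax_j(a_j . x_i)` are assumptions made for illustration; the text only states that the mixing weights are attention scores over basis prompts.

```python
import math

def gpf(X, p):
    """Global prompt: add the same vector p to every node row (X* = X + 1_n p^T)."""
    return [[x + pj for x, pj in zip(row, p)] for row in X]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def gpf_plus(X, basis, score_vectors):
    """Node-specific prompt: x_i* = x_i + sum_j alpha_ij p_j.

    Assumed scoring rule: alpha_i = softmax_j(a_j . x_i), where a_j is row j
    of score_vectors (hypothetical learnable parameters).
    """
    out = []
    for x in X:
        scores = [sum(a * xi for a, xi in zip(a_j, x)) for a_j in score_vectors]
        alpha = softmax(scores)
        prompt = [sum(al * p_j[d] for al, p_j in zip(alpha, basis))
                  for d in range(len(x))]
        out.append([xi + pd for xi, pd in zip(x, prompt)])
    return out

X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # 3 nodes, 2 features
p = [0.5, -0.5]                             # global prompt
basis = [[1.0, 0.0], [0.0, 1.0]]            # k = 2 basis prompts
score_vectors = [[0.0, 0.0], [0.0, 0.0]]    # zero scores give uniform attention

X_gpf = gpf(X, p)
X_plus = gpf_plus(X, basis, score_vectors)
```

In both cases only `p`, `basis`, and `score_vectors` would be trained; the backbone that consumes the prompted features stays frozen.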

Pooling and Readout Prompting

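Methods such as GraphPrompt (Liu et al., 2023) define prompts that enter during the pooling/readout operation. Here, a task-specific prompt vector $p_t$ is multiplied elementwise with the GNN embeddings $h_v$ of the nodes in the subgraph $S_x$ associated with instance $x$ (a node's contextual subgraph, or the whole graph for graph-level tasks) prior to pooling:

$s_{t,x} = \sum_{v\in V(S_x)} (p_t \odot h_v)$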

This focuses on up- or down-weighting salient feature channels for each downstream task, supporting both node and graph-level adaptation.
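A minimal sketch of the prompted sum readout follows, assuming the frozen backbone has already produced the node embeddings. The name `prompted_readout` and the toy values are illustrative only.

```python
def prompted_readout(H, p_t):
    """Sum-pool node embeddings after channel-wise reweighting by p_t."""
    dim = len(p_t)
    s = [0.0] * dim
    for h_v in H:
        for d in range(dim):
            s[d] += p_t[d] * h_v[d]
    return s

H = [[1.0, 2.0, 3.0], [0.0, 1.0, -1.0]]  # embeddings of the nodes in one subgraph
p_task = [1.0, 0.0, 2.0]                  # keeps channel 0, drops channel 1, doubles channel 2
s = prompted_readout(H, p_task)
```

A prompt entry of zero removes a channel from the pooled representation, while an entry above one amplifies it, which is the channel reweighting described above.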

Topology- and Edge-level Prompting

Topology-oriented prompting injects prompts as structural modifications to the underlying adjacency matrix. Representative frameworks such as GraphTOP (Fu et al., 25 Oct 2025) cast adaptation as an edge-rewiring optimization over local subgraphs, learning a soft Bernoulli mask per candidate edge:

$\hat s_{ij} = \mathrm{sigmoid}\left(\frac{g_1 - g_2 + \log\frac{p_{ij}}{1-p_{ij}}}{\tau}\right)$

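where $p_{ij}$ is the learned probability of candidate edge $(i,j)$, $\tau$ is a temperature, and $g_1, g_2$ are random noise terms (independent Gumbel samples in the standard binary-concrete relaxation, which this expression matches). Sparsity and entropy regularizers push the mask toward sparse, near-binary values. Such methods directly manipulate the input graph structure, providing a distinct axis of adaptation compared to feature-level prompting.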
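The relaxed edge mask can be sampled as sketched below. This assumes that $g_1$ and $g_2$ are independent Gumbel(0, 1) draws, as in the standard binary-concrete (Gumbel-sigmoid) relaxation; the helper names and the candidate-edge probabilities are hypothetical.

```python
import math
import random

def sample_gumbel(rng):
    """Draw one Gumbel(0, 1) sample by inverse transform."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
    return -math.log(-math.log(u))

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def soft_edge_mask(p_ij, tau, g1=0.0, g2=0.0):
    """Relaxed Bernoulli mask for one candidate edge; requires 0 < p_ij < 1."""
    logit = math.log(p_ij / (1.0 - p_ij))
    return sigmoid((g1 - g2 + logit) / tau)

rng = random.Random(0)
candidate_probs = {(0, 1): 0.9, (0, 2): 0.2}  # hypothetical learned edge probabilities
mask = {edge: soft_edge_mask(p, 0.5, sample_gumbel(rng), sample_gumbel(rng))
        for edge, p in candidate_probs.items()}
```

With the noise set to zero and $\tau = 1$ the mask equals $p_{ij}$; lowering $\tau$ pushes values toward 0 or 1, which is why the temperature is usually annealed or kept small.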

Edge-level prompting also appears in selective prompting (GSPF (Jiang et al., 2024)), which introduces basis prompts to both node features and edge attributes, with soft gating to enforce selective or robust adaptation to important subgraphs/nodes/edges.

Message-Passing and Aggregation Prompting

Recent work, such as MAGPrompt (Nguyen et al., 5 Feb 2026), extends prompt injection to the message-passing phase of GNNs. Here, learnable gates and additive prompt vectors are inserted per edge and per aggregation layer:

$\tilde{m}_{ij}^{(l)} = a_{ij}^{(l)} m_{ij}^{(l)} + p^{(l)}$

where $a_{ij}^{(l)}$ is a trainable attention gate and $p^{(l)}$ is a message-adaptive prompt vector (or a mixture of basis prompts). This reshapes GNN neighborhood aggregation to fit the downstream task without changing backbone weights.
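One layer of gated, prompt-shifted aggregation might look like the sketch below. It assumes sum aggregation, takes the raw message to be the sender's embedding (a real backbone would apply its frozen transform first), and reads $m_{ij}$ as the message sent from node $j$ to node $i$; all names are illustrative.

```python
def prompted_aggregate(h, edges, gates, prompt):
    """Sum-aggregate messages a_ij * m_ij + p for a single layer.

    h:      dict node -> embedding (list of floats)
    edges:  list of (i, j) pairs, read as "j sends a message to i"
    gates:  dict (i, j) -> scalar gate a_ij (trainable in practice)
    prompt: layer-level prompt vector p (trainable in practice)
    """
    dim = len(prompt)
    out = {i: [0.0] * dim for i in h}
    for (i, j) in edges:
        a_ij = gates[(i, j)]
        for d in range(dim):
            # Raw message m_ij is taken to be h[j]; the backbone itself is untouched.
            out[i][d] += a_ij * h[j][d] + prompt[d]
    return out

h = {0: [1.0, 0.0], 1: [0.0, 2.0], 2: [1.0, 1.0]}
edges = [(0, 1), (0, 2), (1, 0)]
gates = {(0, 1): 0.5, (0, 2): 1.0, (1, 0): 0.0}
prompt = [0.1, 0.1]
agg = prompted_aggregate(h, edges, gates, prompt)
```

A gate of zero suppresses a neighbor's message entirely, yet the additive prompt still contributes, so the prompt can inject task signal even along edges the gate has closed.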

Structure-aware Mixture-of-Experts Prompting

The GMoPE framework (Wang et al., 5 Nov 2025) integrates prompt-based learning with a Mixture-of-Experts (MoE) scheme. Multiple GNN “experts” receive distinct prompt vectors, and a learned router dynamically routes input graphs/tasks to the most relevant expert(s):

$\hat X_m = [\tilde X \| p_m \mathbf{1}^T]$

A soft orthogonality constraint encourages diverse expert specialization, and only prompt vectors (not GNN weights) are tuned during task adaptation.
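The three ingredients named above (per-expert prompt concatenation, routing, and a soft orthogonality penalty) can be sketched as follows. The softmax top-k router and the squared-inner-product penalty are common choices assumed here for illustration; they are not claimed to be GMoPE's exact formulation.

```python
import math

def concat_prompt(X, p_m):
    """Append expert m's prompt to every node's feature row."""
    return [list(row) + list(p_m) for row in X]

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def route(expert_scores, top_k=1):
    """Return the indices of the top_k experts and the full routing weights."""
    weights = softmax(expert_scores)
    ranked = sorted(range(len(weights)), key=lambda m: weights[m], reverse=True)
    return ranked[:top_k], weights

def orthogonality_penalty(prompts):
    """Sum of squared inner products over distinct prompt pairs (assumed form)."""
    penalty = 0.0
    for a in range(len(prompts)):
        for b in range(a + 1, len(prompts)):
            dot = sum(x * y for x, y in zip(prompts[a], prompts[b]))
            penalty += dot * dot
    return penalty

X = [[1.0, 2.0], [3.0, 4.0]]
prompts = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]   # one prompt per expert
chosen, weights = route([0.1, 2.0, -1.0], top_k=1)
X_hat = concat_prompt(X, prompts[chosen[0]])
penalty = orthogonality_penalty(prompts)
```

The penalty is zero only when all prompts are mutually orthogonal, so adding it to the task loss discourages experts from collapsing onto the same prompt.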

3. Adaptation Algorithms and Losses

Most prompt-based GNNs freeze the backbone and optimize prompt parameters (plus a small classifier head) with a task-specific supervised or contrastive objective. Typical losses involve cross-entropy for classification or InfoNCE/contrastive similarity for few-shot and prototypical tasks (Liu et al., 2023, Ge et al., 2023, Yu et al., 2023):

$\mathcal{L}_{\text{prompt}}(p_t) = -\sum_{(x_i, y_i) \in D_t} \log \frac{\exp(\mathrm{sim}(s_{t, x_i}, \tilde{s}_{t, y_i})/\tau)}{\sum_{c\in Y} \exp(\mathrm{sim}(s_{t, x_i}, \tilde{s}_{t,c})/\tau)}$

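In this loss, $s_{t,x_i}$ is the prompted readout of training instance $x_i$, $\tilde{s}_{t,c}$ is the prototype of class $c$ (for example, the mean prompted readout of its labeled instances), $\mathrm{sim}$ is a similarity function such as cosine similarity, and $\tau$ is a temperature. Prompting objectives are frequently regularized by prompt diversity (e.g., orthogonality constraints), prompt sparsity (to avoid over-adaptation), or task-alignment losses (e.g., multi-task identification in ULTRA-DP (Chen et al., 2023)).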
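The prototype-based contrastive objective $\mathcal{L}_{\text{prompt}}$ can be evaluated as sketched below, assuming cosine similarity for $\mathrm{sim}$ and nonzero embeddings. The function names and toy prototypes are illustrative; in practice the embeddings would be prompted readouts and the gradient would flow only into the prompt.

```python
import math

def cosine(u, v):
    """Cosine similarity; both vectors are assumed to be nonzero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def prompt_loss(embeddings, labels, prototypes, tau=0.5):
    """Negative log-softmax of similarity to the true-class prototype, summed over examples."""
    loss = 0.0
    for s, y in zip(embeddings, labels):
        logits = {c: cosine(s, proto) / tau for c, proto in prototypes.items()}
        m = max(logits.values())
        # log-sum-exp with max subtraction for numerical stability
        log_denominator = m + math.log(sum(math.exp(v - m) for v in logits.values()))
        loss -= logits[y] - log_denominator
    return loss

prototypes = {"a": [1.0, 0.0], "b": [0.0, 1.0]}   # class prototypes
embeddings = [[2.0, 0.1], [0.2, 1.5]]              # prompted readouts of two examples
loss_correct = prompt_loss(embeddings, ["a", "b"], prototypes)
loss_swapped = prompt_loss(embeddings, ["b", "a"], prototypes)
```

The loss is lower when each embedding points toward its own class prototype, and a smaller $\tau$ sharpens the penalty for near misses.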

Unsupervised prompt optimization (e.g., UGPrompt (Baghershahi et al., 22 May 2025)) leverages consistency-based and adversarial domain-alignment regularization to enable label-free adaptation under covariate shift.

4. Specializations: Homophily/Heterophily, Dynamic, Inductive, and Privacy

Heterophily-aware prompting: Frameworks such as ProNoG (Yu et al., 2024) and PSP (Ge et al., 2023) introduce node-specific prompt networks (condition-nets) or dual-view contrastive pre-training to handle mixed-homophily graphs, leveraging similarity-weighted neighborhood pooling and flexible prompt-injection to adapt locally to heterophilous patterns.
Dynamic graphs and streaming adaptation: GraphPro (Yang et al., 2023) proposes temporal and structural prompt mechanisms to incrementally adapt pre-trained recommenders without full retraining, using soft recency-weighted edge normalization and prompt graphs assembled from the most recent behaviors.
Inductive adaptation: IGAP (Yan et al., 2024) introduces spectral-space prompts that align node features and graph spectrum between disjoint pre-training and fine-tuning graphs, bridging both signal and structural gaps.
Few-shot and zero-shot regimes: GMoPE (Wang et al., 5 Nov 2025), GraphPrompt (Liu et al., 2023), and related frameworks achieve high accuracy in $k$-shot classification due to parameter-efficient adaptation, outperforming full fine-tuning in extreme low-data settings.
Differential privacy: DP-GPL (Xu et al., 13 Mar 2025) shows that prompt-only adaptation can still incur significant privacy risk, and addresses it with PATE-style privacy-preserving prompt aggregation rather than DP-SGD, maintaining high utility at strict $(\varepsilon,\delta)$ privacy budgets.
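For the privacy item above, the sketch below shows the generic noisy-max vote aggregation used in the PATE family: several "teacher" predictors vote on a label, noise is added to the vote counts, and only the noisy winner is released. This is a textbook-style illustration under the assumption of Laplace noise, not DP-GPL's specific procedure; how teacher prompts are trained and how the privacy budget is accounted for are omitted.

```python
import random

def laplace_noise(rng, scale):
    """Laplace(0, scale) sample as the difference of two exponential draws."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_argmax(teacher_votes, num_classes, scale, rng):
    """Release the class with the highest noise-perturbed vote count."""
    counts = [0] * num_classes
    for vote in teacher_votes:
        counts[vote] += 1
    noisy = [c + laplace_noise(rng, scale) for c in counts]
    return max(range(num_classes), key=lambda c: noisy[c])

rng = random.Random(0)
votes = [0, 1, 1, 1, 2]   # hypothetical labels predicted by five teacher prompts
label = noisy_argmax(votes, num_classes=3, scale=1.0, rng=rng)
```

A larger noise scale gives stronger privacy but makes the released label less reliable when the teachers' votes are close.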

5. Empirical Performance and Application Domains

Prompt-based GNNs have been validated across a wide range of graph domains and downstream tasks:

Node classification (Cora, Citeseer, PubMed, ogbn-arxiv, Chameleon, Actor)
Graph classification (ENZYMES, PROTEINS, COX2, BZR, MoleculeNet benchmarks)
Link prediction and recommendation (Amazon, Taobao, Koubei)
Inductive/few-shot transfer (k-shot splits, low-homophily settings)
Real-world dynamic settings: continual recommendation, evolving graphs, molecular property prediction

Key empirical findings include:

GPF and GPF-plus average +1.4%–3.2% over fine-tuning in full- and few-shot settings (Fang et al., 2022).
GMoPE and MAGPrompt match or surpass full fine-tuning in link prediction, node classification, and graph classification, with reported gains of up to 1–3 pp while tuning <1% of total parameters (Wang et al., 5 Nov 2025, Nguyen et al., 5 Feb 2026).
GraphTOP yields 2–4% absolute gains over feature-prompting baselines, especially on Amazon and Flickr (Fu et al., 25 Oct 2025).
Dynamic GraphPro achieves 60× speedup and state-of-the-art streaming recommendation accuracy (Yang et al., 2023).
UGPrompt (unsupervised) outperforms supervised prompting baselines without using any labels (Baghershahi et al., 22 May 2025).

A representative summary table from GMoPE's experiments is as follows:

Task (metric)	Full fine-tuning (FT)	GMoPE (prompt tuning)	Best baseline
Link Pred. (DGI/AUC)	88.04	88.22	86.63
Node Clf. (ACC)	64.57	64.57	63.95
Graph Clf. (ACC)	66.72	68.14	66.82
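As a quick reading aid, the per-task difference between prompt tuning and full fine-tuning can be computed directly from the table values. The snippet uses only the numbers shown above.

```python
# (full fine-tuning, GMoPE prompt tuning) scores copied from the table
results = {
    "Link Pred. (DGI/AUC)": (88.04, 88.22),
    "Node Clf. (ACC)": (64.57, 64.57),
    "Graph Clf. (ACC)": (66.72, 68.14),
}

# Gain of prompt tuning over full fine-tuning, in percentage points
gains = {task: round(prompt - full, 2) for task, (full, prompt) in results.items()}
for task, gain in gains.items():
    print(f"{task}: {gain:+.2f} pp")
```

In this table, prompt tuning ties full fine-tuning on node classification, is marginally ahead on link prediction, and shows its largest gain on graph classification.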

6. Limitations, Best Practices, and Research Directions

Prompt-based GNNs represent a principled and increasingly mature approach for foundation graph model adaptation. However, open challenges remain:

Theoretical characterization of prompt-task consistency, prompt expressivity, and the conditions under which prompt-based adaptation outperforms full fine-tuning.
Extension to higher-order motif prompting, dynamic prompt routing, and prompt composition over multilayer or hierarchical GNNs.
Effective prompts for extreme domain shift, large-scale graphs, and complex temporal or relational structures.
Robustness to noisy or adversarial labels in small-data, real-world or privacy-sensitive scenarios.
Automated prompt-search and meta-prompting (multi-task, continual learning) (Wang et al., 5 Nov 2025, Chen et al., 2023).

Recommended best practices include freezing GNN backbones, matching prompt dimension to feature or hidden size, using prompt-basis or mixture models in few-shot and highly variable settings, and always validating prompt initializations (ULTRA-DP (Chen et al., 2023)). Domain and task-specific regularization remains critical for scalability and stability.

7. Impact and Outlook

Prompt-based GNNs have enabled strong advances in sample-efficient adaptation, multi-task transfer, automated prompt search (MASPOB (Hong et al., 3 Mar 2026)), privacy-preserving deployment, and flexible handling of dynamic, heterophilous, and large-scale graphs. The field is evolving rapidly, with new theoretical, algorithmic, and application frontiers emerging across network science, recommendation, molecular property prediction, and real-world graph systems (Fang et al., 2022, Liu et al., 2023, Yu et al., 2024, Yang et al., 2023, Fu et al., 25 Oct 2025).

Because it is compatible with most pre-training objectives and graph backbones, the prompt-based paradigm is well positioned to become a standard tool for efficient, scalable, and generalizable graph representation learning.