Message-Passing Graph Neural Networks
- Message-Passing Graph Neural Networks are neural architectures that iteratively update node representations by exchanging and aggregating messages along graph edges.
- They leverage diverse techniques such as attention-based weighting, multi-hop aggregation, and hierarchical structures to enhance expressivity and scalability.
- Their flexible design supports applications across diverse domains, including signal processing, traffic forecasting, MIMO detection, and fairness-aware learning.
A Message-Passing Graph Neural Network (MPNN) is a neural architecture for graph-structured data in which node (or edge) representations are iteratively updated via the exchange and aggregation of messages along the edges of a graph. The message-passing formalism subsumes many widely used GNN variants and provides a unified theoretical and algorithmic framework for discrete structure learning, signal processing, and relational inference on arbitrary graphs.
1. Mathematical Definition and Core Principles
Let $G = (V, E)$ be a graph with node features $x_v$, edge features $e_{uv}$, and possibly global features. An $L$-layer message-passing GNN constructs latent representations $h_v^{(l)}$ for every node $v \in V$, with the canonical layerwise update:

$$h_v^{(l)} = U^{(l)}\!\left(h_v^{(l-1)},\; \mathrm{AGG}_{u \in \mathcal{N}(v)}\, M^{(l)}\!\left(h_v^{(l-1)}, h_u^{(l-1)}, e_{uv}\right)\right),$$

where $M^{(l)}$ is the message function (often a neural network), $\mathrm{AGG}$ is a permutation-invariant aggregator (sum, mean, max), and $U^{(l)}$ is the local node update (typically an MLP or nonlinear transformation) (Vasileiou et al., 10 Feb 2026).
The output embedding at each node or graph is typically read out via a task-specific head. This abstraction yields a class of permutation-equivariant operators over graph signals, ensuring that permuting node indices permutes the outputs correspondingly.
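The layerwise update above can be made concrete with a minimal NumPy sketch, assuming toy linear maps in place of the learned message and update networks; it also illustrates the permutation-equivariance property just described.

```python
import numpy as np

def mp_layer(h, edges, W_msg, W_upd):
    """One layer: h_v <- U(h_v, sum_{u in N(v)} M(h_u)).

    h     : (n, d) node features
    edges : list of directed (u, v) pairs; u sends a message to v
    W_msg : (d, d) toy message weights (stand-in for a learned network M)
    W_upd : (2d, d) toy update weights (stand-in for a learned network U)
    """
    agg = np.zeros_like(h)
    for u, v in edges:                    # exchange messages along edges
        agg[v] += h[u] @ W_msg            # sum is permutation-invariant
    z = np.concatenate([h, agg], axis=1)  # combine self state with aggregate
    return np.tanh(z @ W_upd)             # local node update

# Usage: one layer on a 3-node path graph 0-1-2 with 2-dim features
rng = np.random.default_rng(0)
h = rng.normal(size=(3, 2))
edges = [(0, 1), (1, 0), (1, 2), (2, 1)]
W_msg = rng.normal(size=(2, 2))
W_upd = rng.normal(size=(4, 2))
h1 = mp_layer(h, edges, W_msg, W_upd)
print(h1.shape)  # (3, 2)
```

Relabeling the nodes and their edges permutes the output rows in exactly the same way, which is the permutation equivariance the abstraction guarantees.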
2. Architectural and Theoretical Variants
The message-passing framework supports several design axes and generalizations:
- 1-sided vs. 2-sided messages: In 1-sided schemes, messages depend only on the sender, while in 2-sided (targeted/anisotropic) settings they depend on both sender and receiver. Over undirected graphs the two schemes are equally expressive in the non-uniform regime, but in the uniform regime 2-sided message passing with sum aggregation is strictly more expressive than 1-sided (Grohe et al., 2024).
- Weighted and attention-based passing: Messages can be weighted explicitly via attention or learned coefficients. Edge-variant approaches (e.g., GGNN) permit learned or parametrized edge weights, which can be normalized and/or gated (e.g., via sigmoids) (Raghuvanshi et al., 2023).
- Higher-order/hop generalization: Instead of strictly 1-hop neighbors, K-hop GNNs permit simultaneous aggregation from nodes up to K hops away. These architectures have strictly greater expressive power than 1-hop MPNNs but are bounded above by the 3-dimensional Weisfeiler–Leman (3-WL) test; further augmentation with peripheral subgraph encoding (KP-GNN) can break this limitation on certain graph classes (Feng et al., 2022).
- Hierarchical extension: Flat message-passing stacks can be augmented with hierarchical super-graphs (built via community detection), allowing inter-level propagation to capture long-range dependencies in a number of layers logarithmic, rather than linear, in the graph diameter (Zhong et al., 2020).
- Spectral message passing: MPNNs and spectral GNNs are two parametrizations of the same class of permutation-equivariant operators. Spectral approaches perform message passing in the space of Laplacian eigenvectors, enabling efficient capture of global graph properties and smoothing, while spatial/message-passing views enable sharp analysis of discrete structure and local relationships (Vasileiou et al., 10 Feb 2026, Stachenfeld et al., 2020).
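As a concrete illustration of the first design axis, the toy sketch below contrasts 1-sided and 2-sided messages under sum aggregation; the message functions are illustrative stand-ins, not the constructions analyzed in the cited work.

```python
import numpy as np

def aggregate(h, edges, msg):
    """Sum-aggregate msg(h_u, h_v) into each target node v."""
    out = np.zeros_like(h)
    for u, v in edges:
        out[v] += msg(h[u], h[v])
    return out

def msg_1sided(h_u, h_v):
    return np.tanh(h_u)                  # depends on the sender only

def msg_2sided(h_u, h_v):
    return np.tanh(h_u) * np.tanh(h_v)   # sender modulated by the receiver

# Usage: two directed edges into node 2 of a 3-node graph
h = np.array([[1.0, -1.0], [0.5, 0.5], [0.0, 2.0]])
edges = [(0, 2), (1, 2)]
a1 = aggregate(h, edges, msg_1sided)
a2 = aggregate(h, edges, msg_2sided)
```

The same `aggregate` routine serves both variants; only the message function changes, which is precisely the axis along which the expressivity results above distinguish them.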
3. Expressive Power and Theoretical Guarantees
- Weisfeiler–Leman alignment: Standard MPNNs with injective aggregator and sufficiently expressive update functions have exactly the expressiveness of the 1-dimensional Weisfeiler–Leman (1-WL) test (Vasileiou et al., 10 Feb 2026). K-hop and enhanced schemes can surpass 1-WL for distinguishing regular and distance-regular graphs, with provable results quantifying necessary hop distance for separation (Feng et al., 2022).
- Uniform vs. non-uniform expressivity: Targeted (2-sided) message passing is only more powerful than 1-sided in the uniform regime with sum aggregation. For typical settings with mean or max aggregation or non-uniform model selection, both variants are equally expressive (Grohe et al., 2024).
- Convexification: Convexified message-passing GNNs (CGNNs) recast the non-convex training of GNNs as a convex empirical risk minimization in a reproducing kernel Hilbert space. This delivers globally optimal training, theoretical generalization rate guarantees, and empirical performance greatly exceeding standard deep GNNs, even in shallow models (Cohen et al., 23 May 2025).
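The 1-WL test that bounds standard MPNNs can be sketched in a few lines; the injective hashing here is a simple sorted relabeling, and the example pair (a 6-cycle vs. two disjoint triangles, both 2-regular) is a classic case 1-WL cannot distinguish.

```python
def wl_colors(adj, rounds=3):
    """1-WL color refinement on an undirected adjacency-list graph."""
    colors = {v: 0 for v in adj}                      # uniform initial colors
    for _ in range(rounds):
        # signature = own color plus the multiset of neighbor colors
        sig = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
               for v in adj}
        # injectively relabel signatures to fresh integer colors
        relabel = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        colors = {v: relabel[sig[v]] for v in adj}
    return colors

# C6 and two disjoint triangles have identical 1-WL color histograms
c6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
tri2 = {0: [1, 2], 1: [0, 2], 2: [0, 1], 3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(sorted(wl_colors(c6).values()) == sorted(wl_colors(tri2).values()))  # True
```

Any standard MPNN assigns these two graphs identical embeddings for the same reason, which is what motivates the K-hop and subgraph-enhanced schemes above.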
4. Practical Enhancements, Efficiency, and Scalability
- Residual and weighted connections: Integrating residual shortcuts (à la ResNet) and learnable/gated edge weights sharpens convergence, mitigates vanishing gradients, improves representational capacity, and reduces oversmoothing. These modifications demonstrably improve classification accuracy and training speed on citation and semi-supervised benchmarks (Raghuvanshi et al., 2023).
- Memory-based and persistent mechanisms: Memory-based message passing (MMP) decouples propagation (memory cell) from discrimination (self-embedding), with adaptive gating and decoupling regularization. This yields robustness to heterophily and network noise, and can be used as a plug-in layer within existing architectures (Chen et al., 2022). Persistent message passing (PMP) replaces overwriting with explicit persistence of all past hidden states, supporting exact historical queries and strong out-of-distribution generalization in dynamical data-structure tasks (Strathmann et al., 2021).
- Efficient minibatch training: Accurate and scalable message-passing on large graphs is achieved via the message-invariance principle: if out-of-batch messages can be reconstructed by an invariant transformation of in-batch embeddings, the cost of cross-batch messaging collapses to efficient local computation with compensatory correction (TOP algorithm). This enables order-of-magnitude speedup with minimal loss of accuracy on graphs up to 100M nodes and billions of edges (Shi et al., 27 Feb 2025).
- Adaptive depth and dynamic routing: ADMP-GNN architectures enable per-node adaptive message-passing depth, with multi-exit points and heuristic policies based on node centrality. This adapts the layer count to local graph structure, mitigating oversmoothing or underreaching without increasing parameter count, and improves accuracy over fixed-depth baselines in node classification (Abbahaddou et al., 1 Sep 2025). Dynamic message-passing (N²) constructs flexible, spatially-evolving pathways between nodes and pseudo-nodes in a latent space, scaling global communication to large graphs with linear complexity and high accuracy (Sun et al., 2024).
- Automated architecture search and operation selection: Differentiable neural architecture search (NAS) leveraging a fine-grained search space over message-passing primitives (aggregation and filtering) discovers optimal layer-wise operations and depth, regularly outperforming hand-crafted message-passing architectures across regression, classification, and pattern recognition benchmarks (Cai et al., 2021).
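A minimal sketch combining two of the enhancements above, a residual shortcut and sigmoid-gated edge weights; the gating parametrization is an assumption for illustration, not the exact GGNN formulation.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_residual_layer(h, edges, W, a):
    """h_v <- h_v + tanh( sum_u g(h_u, h_v) * (h_u @ W) ).

    W : (d, d) toy message weights; a : (2d,) toy gate parameters.
    """
    agg = np.zeros_like(h)
    for u, v in edges:
        # learned scalar edge weight, squashed to (0, 1) by a sigmoid gate
        gate = sigmoid(a @ np.concatenate([h[u], h[v]]))
        agg[v] += gate * (h[u] @ W)
    return h + np.tanh(agg)  # residual shortcut: identity path for gradients

# Usage: two nodes, a single gated edge from node 0 to node 1
h = np.array([[0.1, 0.2], [0.3, 0.4]])
out = gated_residual_layer(h, [(0, 1)], np.eye(2), np.zeros(4))
```

Because the residual path is the identity, a node with no incoming messages passes through unchanged, which is one way such layers resist oversmoothing.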
5. Application Domains and Specialized Protocols
- Domain-adaptive and fairness-aware message passing: DM-GNN leverages dual feature extraction, class-label-aware propagation, and conditional adversarial adaptation for cross-network node classification. This combination produces discriminative and transferable embeddings robust to distribution shift (Shen et al., 2023). Fairness-aware GMMD integrates a maximum mean discrepancy (MMD)–based regularizer directly into the message-passing update, provably reducing demographic parity difference and yielding state-of-the-art fairness–accuracy tradeoffs across social network benchmarks (Zhu et al., 2023).
- Physics and symmetry-invariant message passing: Energy-weighted message passing protocols satisfying rigorous symmetries (e.g., infra-red and collinear safety in QCD jet classification) improve explainability, robustness, and performance in high-energy physics applications. Energy-normalized aggregation and structural invariance to physical splitting processes yield architectural guarantees not available with traditional GNNs (Konar et al., 2021).
- Signal processing, communications, and MIMO detection: Custom message-passing GNNs, such as AMP-GNN, unfold traditional probabilistic inference algorithms (approximate message passing) into neural networks, inserting GNN blocks to refine beliefs and leveraging permutation equivariance. In large MIMO systems, this improves detection performance and robustness while maintaining linear scaling with system size (He et al., 2023).
- Foundation model-informed message passing: FIMP integrates pretrained transformer attention modules as message-passing operators. Cross-node, token-level attention re-uses Q/K/V projections from foundation models (e.g., ViT, scGPT) and yields state-of-the-art or competitive results in bioinformatics, brain imaging, and vision tasks. Fine-tuned attention yields performance gains even in zero-shot or limited-label settings (Rizvi et al., 2022).
- Spatiotemporal and relational forecasting: Pure message-passing architectures embedded in spatiotemporal deep stacks (e.g., Graph-WaveNet for traffic prediction) explicitly model pairwise node–node interactions, resulting in lower forecasting error than convolutional or attention-based spatial modules, especially on predictive tasks dependent on joint node behavior (Prabowo et al., 2023).
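As one concrete ingredient of the fairness-aware scheme above, an MMD term between the embeddings of two sensitive groups can be computed as follows; the RBF kernel and standalone-regularizer framing are simplifying assumptions, whereas GMMD folds a related term into the message-passing update itself.

```python
import numpy as np

def rbf(x, y, gamma=1.0):
    """RBF kernel matrix between row sets x and y."""
    d = x[:, None, :] - y[None, :, :]
    return np.exp(-gamma * (d ** 2).sum(-1))

def mmd2(h, group):
    """Biased squared-MMD estimate between group-0 and group-1 embeddings."""
    a, b = h[group == 0], h[group == 1]
    return rbf(a, a).mean() + rbf(b, b).mean() - 2.0 * rbf(a, b).mean()

# Usage: identical group embeddings give zero MMD; a shift makes it positive
rng = np.random.default_rng(1)
emb = rng.normal(size=(4, 2))
group = np.array([0, 0, 0, 0, 1, 1, 1, 1])
print(mmd2(np.vstack([emb, emb]), group))            # 0.0
print(mmd2(np.vstack([emb, emb + 5.0]), group) > 0)  # True
```

Penalizing this quantity during training pushes the two groups' embedding distributions together, which is the mechanism behind the demographic-parity reduction cited above.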
6. Spectral–Spatial Unification and Theoretical Insights
Recent work emphasizes the formal equivalence, under mild assumptions, between message-passing (spatial) and spectral GNNs. Both can be seen as approximators of permutation-equivariant operators. Spectral GNNs endow the message-passing paradigm with principled tools for low-pass filtering, bottleneck estimation, stability analysis, and community detection, while the spatial/message-passing language facilitates logical, algorithmic, and relational expressivity (Vasileiou et al., 10 Feb 2026). Hybrid models that interleave spatial and spectral message passing can converge rapidly, are robust to missing edges, and leverage both local and global structures (Stachenfeld et al., 2020).
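A minimal sketch of the spectral side, assuming a dense combinatorial Laplacian: filtering a graph signal in the Laplacian eigenbasis is the spectral counterpart of repeated spatial message passing with the same operator.

```python
import numpy as np

def spectral_lowpass(A, x, keep=2):
    """Keep only the `keep` smoothest Laplacian modes of signal x."""
    L = np.diag(A.sum(axis=1)) - A    # combinatorial Laplacian L = D - A
    w, U = np.linalg.eigh(L)          # eigenvalues returned in ascending order
    x_hat = U.T @ x                   # graph Fourier transform
    x_hat[keep:] = 0.0                # zero out the high-frequency modes
    return U @ x_hat                  # inverse transform back to node space

# Usage: on a connected path graph, the lowest mode is the constant signal
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
print(np.allclose(spectral_lowpass(A, np.ones(4), keep=1), np.ones(4)))  # True
```

Low-pass filtering and community structure fall out of the eigenbasis directly, while the equivalent spatial implementation would be a polynomial in the same Laplacian, which is the equivalence the position paper formalizes.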
7. Benchmarks, Empirical Performance, and Open Problems
Across classification, regression, and structural tasks, message-passing GNNs and their enhanced/variant forms achieve state-of-the-art accuracy, robustness under noise or sparsity, and superior resource efficiency. Convexified and fair GNNs demonstrate new optimality or fairness guarantees (Cohen et al., 23 May 2025, Zhu et al., 2023), dynamic and scalable message-passing protocols enable handling of industrial-scale graphs (Shi et al., 27 Feb 2025, Sun et al., 2024), and domain- and physics-aware message schemes ensure invariance and interpretability in specialized applications (Konar et al., 2021, He et al., 2023).
Current research directions include further generalization of message-passing schedules (adaptive architectures, dynamic routing), unification with attention-based and spectral models, exploration of new expressivity regimes beyond the Weisfeiler–Leman hierarchy, and integration with large pretrained models. Theoretical questions remain regarding the limits of message-invariance, efficient compensation schemes, and the trade-offs between model depth, permutation equivariance, and scalability.
References:
- (Vasileiou et al., 10 Feb 2026) Position: Message-passing and spectral GNNs are two sides of the same coin
- (Cohen et al., 23 May 2025) Convexified Message-Passing Graph Neural Networks
- (Grohe et al., 2024) Are Targeted Messages More Effective?
- (Raghuvanshi et al., 2023) GGNNs : Generalizing GNNs using Residual Connections and Weighted Message Passing
- (Zhu et al., 2023) Fairness-aware Message Passing for Graph Neural Networks
- (Rizvi et al., 2022) FIMP: Foundation Model-Informed Message Passing for Graph Neural Networks
- (Feng et al., 2022) How Powerful are K-hop Message Passing Graph Neural Networks
- (Chen et al., 2022) Memory-based Message Passing: Decoupling the Message for Propagation from Discrimination
- (Strathmann et al., 2021) Persistent Message Passing
- (Cai et al., 2021) Rethinking Graph Neural Architecture Search from Message-passing
- (Konar et al., 2021) Energy-weighted Message Passing: an infra-red and collinear safe graph neural network algorithm
- (Stachenfeld et al., 2020) Graph Networks with Spectral Message Passing
- (He et al., 2023) Message Passing Meets Graph Neural Networks: A New Paradigm for Massive MIMO Systems
- (Shen et al., 2023) Domain-adaptive Message Passing Graph Neural Network
- (Abbahaddou et al., 1 Sep 2025) ADMP-GNN: Adaptive Depth Message Passing GNN
- (Sun et al., 2024) Towards Dynamic Message Passing on Graphs
- (Shi et al., 27 Feb 2025) Accurate and Scalable Graph Neural Networks via Message Invariance
- (Prabowo et al., 2023) Message Passing Neural Networks for Traffic Forecasting
- (Zhong et al., 2020) Hierarchical Message-Passing Graph Neural Networks