Adaptive Graph Rewiring in Mesh GNNs

Updated 23 November 2025

The paper introduces a dynamic, layerwise rewiring mechanism that injects nonlocal connections based on geometric and velocity metrics to mitigate over-squashing in mesh-based GNNs.
It employs an adaptive delay scoring function to schedule edge injection in line with physical propagation delays, enhancing the model's prediction accuracy in fluid dynamics.
Empirical evaluations show that AdaMeshNet reduces velocity prediction error by up to 20% and further improves MSE over static rewiring methods with only modest runtime overhead.

Adaptive Graph Rewiring in Mesh-Based Graph Neural Networks (AdaMeshNet) refers to a graph-topology adaptation methodology specifically designed to mitigate the over-squashing pathology endemic to mesh-based GNNs applied to fluid dynamics simulations and related domains. Unlike conventional static rewiring, which globally adds long-range graph edges based solely on precomputed topological or spectral heuristics, AdaMeshNet implements a dynamic scheme: it injects new topologically nonlocal connections at message-passing layers whose depths are commensurate with the underlying physical delay—modeling the sequential, finite-speed propagation of physical interactions across the mesh. The key technical innovation is to parameterize this "rewiring delay" by combining geometric (shortest-path) distance and dynamic (velocity-difference) metrics, resulting in a per-edge, fractional-depth scheduling mechanism that tightly couples the computational graph to the underlying physical transport process (Seo et al., 16 Nov 2025).

1. Over-Squashing in Mesh-Based GNNs and Motivation

Over-squashing describes the exponential decay of information propagated through long chains of nodes in a sparse mesh, formalized by the bound

$\left|\frac{\partial h_i^{(r)}}{\partial x_s}\right| \le (\alpha_e\beta_h)^r (\hat A^r)_{is}$

where $h_i^{(r)}$ is the node representation at layer $r$, $\hat A$ is the normalized adjacency, and $\alpha_e,\beta_h<1$ are contraction constants. In mesh-based GNNs (e.g., MeshGraphNet), refinement concentrates mesh resolution in high-gradient regions, increasing the graph's diameter and exposing the core bottleneck: information originating far from a node must traverse long, narrow topological channels, compressing distant physical effects into vanishingly small Jacobians. Static graph rewiring (via curvature, spectral gap maximization, Delaunay triangulation, or diffusion distances) attempts to circumvent this by globally inserting edges before any message passing. However, such global, instantaneously applied shortcuts are physically unrealistic: they allow flow information to propagate with infinite velocity, violating the finite-speed nature of signal transport in fluids and other physical systems. In reality, the mesh's spatial resolution constrains interaction arrival times, so each shortcut should be introduced only at the appropriate computation depth (Seo et al., 16 Nov 2025).
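To make the decay concrete, the following minimal sketch evaluates the right-hand side of the bound on a path graph, which stands in for a long, narrow mesh channel. The path graph, the self-loop normalization $\hat A = D^{-1/2}(A+I)D^{-1/2}$, and the values $\alpha_e = \beta_h = 0.9$ are illustrative assumptions, not settings taken from the paper.

```python
import numpy as np

def normalized_adjacency(n):
    """Symmetric-normalized adjacency (with self-loops) of an n-node path graph."""
    A = np.eye(n)
    for k in range(n - 1):
        A[k, k + 1] = A[k + 1, k] = 1.0
    d = A.sum(axis=1)
    return A / np.sqrt(np.outer(d, d))

def squashing_bound(A_hat, i, s, r, alpha_e=0.9, beta_h=0.9):
    """Right-hand side of the bound: (alpha_e * beta_h)^r * (A_hat^r)[i, s]."""
    return (alpha_e * beta_h) ** r * np.linalg.matrix_power(A_hat, r)[i, s]

A_hat = normalized_adjacency(32)
# Sensitivity bound of node r to source node 0 after exactly r layers.
bounds = [squashing_bound(A_hat, i=r, s=0, r=r) for r in range(1, 11)]
for r, b in enumerate(bounds, start=1):
    print(f"distance {r:2d}: bound {b:.3e}")
```

In this toy setting the bound shrinks by a roughly constant factor with every extra hop, which is the exponential decay that rewiring is meant to counteract.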

2. Mathematical Formulation of Adaptive Rewiring Delay

AdaMeshNet identifies "bottleneck" nodes using the average Ollivier–Ricci curvature, computed as

$\gamma_i = \frac{1}{|\mathcal N(i)|} \sum_{j \in \mathcal N(i)} \kappa(i,j)$

where $\kappa(i,j)$ denotes edge-level curvature. The subset $B$ of nodes with the lowest $\gamma_i$ (typically the lower $\alpha$-percentile) is treated as the set of bottlenecks. For each $i \in B$, a distant partner $i^*$ is chosen by minimizing curvature within an $r$-hop neighborhood. The rewiring delay score is then defined by

$s_{\rm delay}(i,i^*) = \beta\, d_{\mathcal G}(i,i^*) + (1 - \beta)\, \| v_i - v_{i^*} \|_2$

where $d_{\mathcal G}(i, i^*)$ is the shortest-path distance in the original mesh and $v$ is the node velocity vector. The hyperparameter $\beta \in [0,1]$ trades off geometric delay (signal path length) against dynamic delay (relative node velocities). The score $s_{\rm delay}(i,i^*) \in [0, L]$ may be fractional (for $L$ message-passing layers); the edge $(i, i^*)$ is scheduled for injection after the $\ell$-th layer, where $\ell < s_{\rm delay}(i,i^*) \leq \ell+1$. Thus, edge creation becomes a function of both spatial separation and the velocity difference between the two nodes, yielding a physically motivated, layer-wise rewiring process (Seo et al., 16 Nov 2025).
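As a rough illustration, the delay score and the resulting injection layer can be computed as below. The function names, the clipping of the score to $[0, L]$, and the mapping of a zero score to layer 0 are assumptions made for this sketch; the source does not specify how the score is brought into range.

```python
import math
import numpy as np

def delay_score(d_graph, v_i, v_j, beta):
    """s_delay = beta * d_G(i, i*) + (1 - beta) * ||v_i - v_i*||_2."""
    dv = np.linalg.norm(np.asarray(v_i, dtype=float) - np.asarray(v_j, dtype=float))
    return beta * d_graph + (1.0 - beta) * float(dv)

def injection_layer(score, num_layers):
    """Layer ell with ell < score <= ell + 1.

    Assumptions of this sketch: the score is clipped to [0, num_layers],
    and a score of exactly 0 is mapped to layer 0.
    """
    clipped = min(max(score, 0.0), float(num_layers))
    return max(math.ceil(clipped) - 1, 0)

# Hypothetical pair: 4 hops apart, velocity difference of norm 1, beta = 0.5.
s = delay_score(d_graph=4, v_i=(1.0, 0.0), v_j=(0.4, 0.8), beta=0.5)
layer = injection_layer(s, num_layers=15)
print(s, layer)
```

Here the score is about 2.5, so the edge is injected after layer 2. A floor-based index would give the same layer except when the score is an exact integer.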

3. Layerwise Dynamic Edge Injection Algorithm

The AdaMeshNet algorithm executes as follows. In each time step:

Preprocessing:

Edgewise Ricci curvature $\kappa(i,j)$ and nodewise averages $\gamma_i$ are computed.
The bottom $\alpha$ fraction of $\gamma_i$ marks bottleneck nodes.
For each bottleneck node $i$, a partner $i^*$ is determined by minimal local curvature.
$s_{\rm delay}(i,i^*)$ is evaluated as above, and the scheduling index $\ell(i, i^*) = \lfloor s_{\rm delay}(i,i^*) \rfloor$ is stored.
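The first two preprocessing steps can be sketched as follows, taking the edge curvatures as given. Computing Ollivier–Ricci curvature itself requires an optimal-transport solver and is omitted; the curvature values, the helper names, and the use of `np.percentile` as the thresholding rule are assumptions for illustration only.

```python
import numpy as np

def node_curvature(num_nodes, edges, kappa):
    """gamma_i: mean of the edge curvatures kappa(i, j) over the neighbors of i."""
    total = np.zeros(num_nodes)
    degree = np.zeros(num_nodes)
    for (i, j), k in zip(edges, kappa):
        total[i] += k
        total[j] += k
        degree[i] += 1
        degree[j] += 1
    return total / np.maximum(degree, 1)  # isolated nodes get gamma = 0

def bottleneck_nodes(gamma, alpha_pct):
    """Indices of nodes whose gamma lies at or below the alpha-th percentile."""
    threshold = np.percentile(gamma, alpha_pct)
    return np.flatnonzero(gamma <= threshold)

# Two triangles joined by a bridge (2, 3); curvature values are made up,
# with the bridge given a negative value as is typical for bottleneck edges.
edges = [(0, 1), (1, 2), (0, 2), (2, 3), (3, 4), (4, 5), (3, 5)]
kappa = [0.5, 0.5, 0.5, -0.6, 0.5, 0.5, 0.5]
gamma = node_curvature(6, edges, kappa)
bottlenecks = bottleneck_nodes(gamma, alpha_pct=30)
print(gamma, bottlenecks)
```

In this toy graph the two bridge endpoints have the lowest average curvature and are the only nodes selected.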

Message Passing and Rewiring:

For each graph convolution layer $\ell = 0, \ldots, L-1$:

Newly scheduled edges $(i, i^*)$ with $\ell(i, i^*) = \ell$ are injected into the mesh topology.
Message updates at layer $\ell+1$ use the expanded adjacency, aggregating across both mesh and new rewired neighbors:
- Edge update: $e_{ij}^{(\ell+1)} = f_E( e_{ij}^{(\ell)}, h_i^{(\ell)}, h_j^{(\ell)} )$
- Node update: $h_i^{(\ell+1)} = f_V( h_i^{(\ell)}, \sum_{j\in N_i^{(\ell+1)}} e_{ij}^{(\ell+1)} )$
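The loop above can be sketched in a few lines of NumPy. This is a toy under stated assumptions, not the paper's implementation: $f_E$ and $f_V$ are replaced by parameter-free `tanh` updates, injected edges start with zero features and are added in both directions, and `run_layers` and `scheduled` are hypothetical names.

```python
import numpy as np

def run_layers(h, mesh_edges, scheduled, num_layers):
    """h: (N, D) node features; mesh_edges: directed (i, j) pairs;
    scheduled: dict mapping a layer index to the (i, i_star) pairs injected there."""
    edges = list(mesh_edges)
    e = {ij: np.zeros(h.shape[1]) for ij in edges}
    for ell in range(num_layers):
        # Inject the edges scheduled for this layer (both directions).
        for (i, j) in scheduled.get(ell, []):
            for ij in ((i, j), (j, i)):
                if ij not in e:
                    edges.append(ij)
                    e[ij] = np.zeros(h.shape[1])
        # Edge update: stand-in for f_E(e_ij, h_i, h_j).
        e = {(i, j): np.tanh(e[(i, j)] + h[i] + h[j]) for (i, j) in edges}
        # Node update: stand-in for f_V(h_i, sum over neighbors j of e_ij).
        agg = np.zeros_like(h)
        for (i, j), msg in e.items():
            agg[i] += msg
        h = np.tanh(h + agg)
    return h, edges

# Path graph 0-1-2-3-4-5 with a signal only at node 0.
mesh = [(k, k + 1) for k in range(5)] + [(k + 1, k) for k in range(5)]
h0 = np.zeros((6, 2))
h0[0] = 1.0
h_plain, _ = run_layers(h0, mesh, {}, num_layers=2)
h_rewired, edges_out = run_layers(h0, mesh, {1: [(5, 0)]}, num_layers=2)
print(h_plain[5], h_rewired[5])
```

With two layers and no rewiring, node 5 never hears from node 0. Injecting the edge (5, 0) at layer 1 lets the signal arrive at the second update, but not earlier, which is the delayed-shortcut behavior the section describes.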

Decoding and Mesh Update:

After $L$ layers, node embeddings are decoded to predict next-step velocities and mesh states. Remeshing and feature updates then prepare the graph for the next simulation time step (Seo et al., 16 Nov 2025).

4. Empirical Evaluation and Comparative Performance

AdaMeshNet was benchmarked on:

Cylinder Flow (laminar): 1,000 simulations, 600 time steps
Airfoil (turbulent): 1,000 simulations, 600 time steps

Evaluation metrics include mean squared error (MSE) on velocity fields and velocity gradients at the next time step. The baselines were MeshGraphNet (MGN, no rewiring), PIORF (static spectral rewiring), FOSR, Delaunay rewiring, and static curvature-based rewiring.

Key findings:

AdaMeshNet reduces velocity-prediction error by 10–20% over MeshGraphNet.
Relative to the strongest static rewiring (PIORF), AdaMeshNet further reduces MSE by approximately 5%.
Gains are maximal for turbulent Airfoil flow, indicating the critical importance of delayed, sequential interaction modeling in systems with long-range, chaotic dependencies.
Runtime overhead for per-layer rewiring is modest (5–15%) and offset by the significant empirical accuracy improvements (Seo et al., 16 Nov 2025).

5. Sequential Propagation: Theoretical Significance

The scheduling of edge injection by $s_{\rm delay}$ aligns the computational graph topology with the physical system's causal structure. Each additional unit of path length raises the score by $\beta$, and each additional unit of velocity difference raises it by $1-\beta$, so more distant or more dynamically dissimilar node pairs are connected at deeper layers. Thus, AdaMeshNet simulates the finite-speed, sequential accumulation of signals, in contrast to static rewiring methods, which effectively collapse physical distance to zero at initialization.

This correspondence with physical signal transport rests on the observation that an edge introduced at a deeper layer relieves the over-squashing bottleneck only once the relevant information has had time to reach the node in question, which respects physical causality and preserves the geometric fidelity of the mesh (Seo et al., 16 Nov 2025).

6. Relation to Existing Adaptive Mesh GNNs and Rewiring Strategies

Conventional adaptive mesh-based GNNs (e.g., MeshGraphNet (Pfaff et al., 2020), ADAPT-GNN (Perera et al., 2022), and multiscale AMR-based pipelines (Perera et al., 2024)) dynamically update graph topology in response to predicted local error, phase field, or geometric indicators, but insert or remove edges/nodes solely based on spatial refinement criteria. Static curvature- or spectral-based rewiring (e.g., PIORF (Yu et al., 5 Apr 2025)) computes bottlenecks and adds long-range connections globally prior to message passing, which does not reflect physical delays.

AdaMeshNet's distinctive contribution is to modulate the timing of edge introduction—matching the progressive, finite-velocity arrival of physical influences, rather than assuming instantaneous, global accessibility. This approach can be considered a discretization of physical causality in mesh-based GNN architectures, providing a theoretical and empirical advancement over both static rewiring and purely local mesh adaptation (Seo et al., 16 Nov 2025).

7. Summary Table: Comparative Perspectives

Method	Rewiring Timing	Physical Delay Modeled	Key Mechanism
Static Curvature/Spectral (PIORF, etc.)	Once, before GNN rollouts	No, all shortcuts present initially	Bottleneck/gradient-driven edge addition
Generic Adaptive Remeshing (MGN, AMR)	Per-timestep, by mesh ops	Indirect (via mesh refinement only)	Local error signals, mesh resizing.
AdaMeshNet	Per-GNN-layer, scheduled	Yes, via $s_\mathrm{delay}$	Layerwise adaptive edge injection

AdaMeshNet, by integrating topological bottleneck detection, velocity-aware delay scoring, and depth-scheduled edge injection, achieves a physically plausible, empirically validated mitigation of over-squashing in mesh-based GNNs for fluid dynamics simulation, outstripping static rewiring and mesh-only adaptation in both interpretability and predictive performance (Seo et al., 16 Nov 2025).