DirGraphSSM: SSMs for Directed Graphs
- DirGraphSSM is a graph neural network framework that fuses state space models with directed message-passing to efficiently capture long-range causal dependencies.
- It uses the DirEgo2Token mechanism to sequentialize k-hop in-neighborhoods based on shortest-path distances, enabling parallel, permutation-invariant processing.
- Experimental results show state-of-the-art accuracy and 1.5× to 2× faster training on benchmarks such as ogbg-code2 and MalNet-Tiny.
DirGraphSSM is a class of graph neural network models that applies state space models (SSMs) to directed graphs, enabling scalable learning of long-range causal dependencies in large-scale directed graph settings. The core innovation is the integration of SSM computational kernels—originally successful in sequence modeling—directly into message-passing architectures over directed graphs, addressing the challenges unique to directed network structures such as causal information flow, asymmetry, and scalable parallelization (She et al., 17 Sep 2025).
1. Framework Overview and Novelty
DirGraphSSM systematically extends SSMs to directed graphs by leveraging a novel sequentialization of directed neighborhoods and a parallel message-passing formulation. The architecture relies on the DirEgo2Token module, which organizes each node's k-hop in-neighborhood using shortest-path distance (SPD) and constructs a local sequence: for node $v$, the $j$-th group $G_v^{(j)}$ contains all nodes at directed distance $j$ from $v$. The sequence $S_v = (G_v^{(k)}, \dots, G_v^{(1)}, G_v^{(0)})$ enables processing of each node's causal predecessors in SPD order, reflecting true causal structure without arbitrary global ordering.
Rather than a purely sequential or attention-based scan, DirGraphSSM reframes the SSM recurrence as a message-passing kernel: messages from directed $k$-hop neighbors are aggregated using a multi-head attention mechanism, and each message is filtered by an SSM kernel parameterized by hop count. This design enables both efficient parallel computation and localized incorporation of long-range, directional dependencies.
2. Directed Ego-Graph Sequentialization (DirEgo2Token)
The DirEgo2Token mechanism formalizes the causal sequence over a directed graph $G = (V, E)$. For a positive integer $k$:
- For each $j \in \{0, 1, \dots, k\}$, $G_v^{(j)} = \{u \in V : \mathrm{SPD}(u, v) = j\}$, i.e., the set of nodes whose shortest directed path to $v$ has length $j$ (with $G_v^{(0)} = \{v\}$).
This construction ensures permutation invariance, as each node's local view is independent of any external node ordering. Information flows from the most distant causal predecessors toward $v$, affording a direct way to capture and process long-range dependencies that are crucial in directed graph applications.
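As a minimal sketch (not the paper's implementation; the function name and `in_adj` representation are hypothetical), the grouping step reduces to a breadth-first search over in-edges, which yields the SPD groups directly:

```python
from collections import deque

def dir_ego2token(in_adj, v, k):
    """Group node v's k-hop in-neighborhood by shortest-path distance.

    in_adj[u] lists the in-neighbors (direct predecessors) of u.
    Returns [G_0, G_1, ..., G_k] where G_j is the set of nodes at
    directed distance j from v (so G_0 = {v}).
    """
    groups = [set() for _ in range(k + 1)]
    groups[0].add(v)
    seen = {v}
    frontier = deque([(v, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == k:
            continue  # do not expand beyond the k-hop horizon
        for u in in_adj.get(node, []):
            if u not in seen:  # BFS guarantees first visit = shortest distance
                seen.add(u)
                groups[d + 1].add(u)
                frontier.append((u, d + 1))
    return groups
```

For the directed chain a → b → c, `dir_ego2token({'c': ['b'], 'b': ['a']}, 'c', 2)` yields `[{'c'}, {'b'}, {'a'}]`: the most distant causal predecessor lands in the last group, matching the SPD ordering described above.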
3. Message-Passing SSM Scan and Aggregation
DirGraphSSM realizes SSM computation over the directed graph via a scan over the sequence $S_v = (G_v^{(k)}, \dots, G_v^{(0)})$, employing
- A multi-head attention aggregate within each group $G_v^{(j)}$:
$$m_v^{(j)} = \sum_{u \in G_v^{(j)}} \alpha_{uv} \, W x_u,$$
where $\alpha_{uv}$ is the attention weight of source node $u$ for target $v$, and $x_u$ includes structural and positional encodings.
- Processing the resulting sequence $(m_v^{(k)}, \dots, m_v^{(0)})$ with an SSM convolution kernel:
$$h_v = \sum_{j=0}^{k} \bar{K}_j \, m_v^{(j)}.$$
This is mathematically equivalent to a message-passing protocol:
$$h_v = \sum_{j=0}^{k} \sum_{u \in G_v^{(j)}} \alpha_{uv} \, \mathrm{SSM}_j(W x_u),$$
where $\alpha_{uv}$ is the normalized attention coefficient and $\mathrm{SSM}_j$ denotes the kernel applied with a delay of $j$ hops.
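The scan above can be sketched in NumPy for a single head and a single target node. This is an illustrative simplification, not the paper's API: attention is reduced to a dot product with a query vector, and the SSM filter is a scalar per hop.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def mp_ssm_scan(groups, feats, W, q, ssm_kernel):
    """Single-head message-passing SSM scan for one target node.

    groups     : [G_0..G_k], node ids grouped by hop distance (DirEgo2Token).
    feats[u]   : feature vector of node u (positional encodings folded in).
    W          : value projection, shape (d_out, d_in).
    q          : query vector of the target node, shape (d_out,).
    ssm_kernel : length-(k+1) array; ssm_kernel[j] filters hop-j messages.
    """
    h = np.zeros(W.shape[0])
    for j, group in enumerate(groups):
        if not group:
            continue
        msgs = np.stack([W @ feats[u] for u in sorted(group)])  # (|G_j|, d_out)
        alpha = softmax(msgs @ q)        # normalized attention within hop j
        m_j = alpha @ msgs               # aggregated hop-j message m_v^(j)
        h += ssm_kernel[j] * m_j         # SSM filtering by hop count
    return h
```

Because aggregation happens per hop group and the kernel weight depends only on $j$, the same computation can be batched over all target nodes, which is what makes the formulation parallel rather than recurrent.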
4. Mathematical Formulation
The key equations and computational steps include:
- k-hop in-neighborhood: $\mathcal{N}_k^{-}(v) = \{u \in V : 0 < \mathrm{SPD}(u, v) \le k\}$
- Grouping by hop distance: $G_v^{(j)} = \{u \in V : \mathrm{SPD}(u, v) = j\}$ for $j = 0, \dots, k$
- Multi-head attention-based aggregation: $m_v^{(j)} = \sum_{u \in G_v^{(j)}} \alpha_{uv} \, W x_u$
- SSM scan over structurally ordered sequence, efficiently computed as a convolution with hop-dependent SSM kernels.
These modules are fully parallelizable across nodes and hops, in contrast to inefficient sequence-based handling with explicit padding.
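The paper's exact kernel parameterization is not reproduced here, but for a standard diagonal SSM (in the style of S4D), the hop-indexed filter takes the form $\bar{K}_j = C A^j B$, which can be precomputed for all hops at once:

```python
import numpy as np

def ssm_hop_kernel(A_diag, B, C, k):
    """Hop-indexed SSM convolution kernel K_j = C · A^j · B, j = 0..k.

    A_diag : diagonal state-matrix entries, shape (n,); |A_i| < 1 for stability.
    B, C   : input/output maps, shape (n,) each.
    Returns a length-(k+1) array; K_j is the scalar filter applied to
    messages arriving from hop-j in-neighbors.
    """
    # powers[j, i] = A_i^j, computed for all hops in one broadcast
    powers = A_diag[None, :] ** np.arange(k + 1)[:, None]  # (k+1, n)
    return powers @ (B * C)                                # K_j = sum_i C_i A_i^j B_i
```

Since the kernel depends only on the hop index, every node and every hop group can be processed with the same precomputed weights; no per-node padding or sequential recurrence is required, consistent with the parallelism claim above.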
5. Experimental Results and Performance
DirGraphSSM achieves state-of-the-art (SOTA) accuracy and significant speedup on representative directed graph benchmarks, including classification on ogbg-code2 (Python ASTs), regression on the NA architecture dataset, node classification (self-citation), and classification/regression on directed cyclic graphs (MalNet-Tiny, EDA-HLS). Examples from the results:
- On ogbg-code2, DirGraphSSM achieves test F1 in the 20.5% range, highly competitive with non-SSM models.
- DirGraphSSM attains 1.5× to 2× faster training per epoch compared with the best graph Transformer/Mamba variants.
- Ablations confirm the importance of each module (e.g., DepthPlus positional encoding, DirGatedGCN, Digraph Fusion Attention) for both accuracy and information propagation.
- The message-passing SSM formulation eliminates padding overhead and scales efficiently to large graphs (She et al., 17 Sep 2025).
6. Applications and Implications
DirGraphSSM is particularly suitable for domains requiring accurate, scalable, and interpretable modeling of directed dependencies:
- Code analysis (e.g., program graphs, ASTs in ogbg-code2)
- Neural architecture performance regression (structural causal modeling in NAS)
- Information flow in social and communication networks (where directionality encodes causality)
- Malware detection and security (MalNet-Tiny, where call graphs are directed and cyclic)
- Hardware synthesis graph analysis (EDA-HLS)
The principle of sequentializing local directed neighborhoods with k-hop ego graphs and processing them via SSM kernels could generalize to a wide range of causal inference and signal processing applications on directed graphs.
7. Significance and Future Perspectives
DirGraphSSM marks the first systematic integration of modern state space models with directed graph learning, offering a principled and efficient architecture for capturing long-range causal dependencies (She et al., 17 Sep 2025). Its design aligns the strengths of SSMs (memory, stability, long-term information propagation) with the needs of directed message-passing frameworks. Further development could focus on:
- Deeper theoretical analysis of stability, spectral properties, and expressivity in directed SSM-GNNs
- Extension to temporal or dynamic directed graphs
- Adaptation of architecture for highly heterogeneous or multi-relational directed graphs
The permutation-invariant, parallelizable treatment highlights a clear direction for future research at the intersection of causal graph modeling, sequence learning, and scalable deep learning.