DirGraphSSM: SSMs for Directed Graphs
- DirGraphSSM is a graph neural network framework that fuses state space models with directed message-passing to efficiently capture long-range causal dependencies.
- It uses the DirEgo2Token mechanism to sequentialize k-hop in-neighborhoods based on shortest-path distances, enabling parallel, permutation-invariant processing.
- Experimental results show state-of-the-art accuracy and 1.5× to 2× faster training on benchmarks such as ogbg-code2 and MalNet-Tiny.
DirGraphSSM is a class of graph neural network models that applies state space models (SSMs) to directed graphs, enabling scalable learning of long-range causal dependencies in large-scale directed graph settings. The core innovation is the integration of SSM computational kernels—originally successful in sequence modeling—directly into message-passing architectures over directed graphs, addressing the challenges unique to directed network structures such as causal information flow, asymmetry, and scalable parallelization (She et al., 17 Sep 2025).
1. Framework Overview and Novelty
DirGraphSSM systematically extends SSMs to directed graphs by leveraging a novel sequentialization of directed neighborhoods and a parallel message-passing formulation. The architecture relies on the DirEgo2Token module, which organizes each node's k-hop in-neighborhood using shortest-path distance (SPD) and constructs a local sequence: for node $v$, the $j$-th group $G_v^{(j)}$ contains all nodes at directed distance $j$ from $v$. The sequence $S_v = (G_v^{(k)}, \dots, G_v^{(1)}, G_v^{(0)})$ enables processing of each node's causal predecessors in SPD order, reflecting true causal structure without arbitrary global ordering.
Rather than a purely sequential or attention-based scan, DirGraphSSM reframes the SSM recurrence as a message-passing kernel: messages from directed $k$-hop neighbors are aggregated using a multi-head attention mechanism, and each message is filtered by an SSM kernel parameterized by hop count. This design enables both efficient parallel computation and localized incorporation of long-range, directional dependencies.
2. Directed Ego-Graph Sequentialization (DirEgo2Token)
The DirEgo2Token mechanism formalizes the causal sequence over a directed graph $G = (V, E)$. For a positive integer $k$:
- For each $j \in \{0, 1, \dots, k\}$, $G_v^{(j)} = \{u \in V : \mathrm{SPD}(u, v) = j\}$, i.e., the set of nodes whose shortest directed path to $v$ has length $j$ (with $G_v^{(0)} = \{v\}$).
This construction ensures permutation invariance, as each node's local view is independent of any external node ordering. Information flows from the most distant causal predecessors toward $v$, affording a direct way to capture and process long-range dependencies that are crucial in directed graph applications.
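As a minimal sketch (not the paper's implementation; the function name and `in_adj` representation are hypothetical), the grouping step reduces to a breadth-first search over in-edges, which yields the SPD groups directly:

```python
from collections import deque

def dir_ego2token(in_adj, v, k):
    """Group node v's k-hop in-neighborhood by shortest-path distance.

    in_adj[u] lists the in-neighbors (direct predecessors) of u.
    Returns [G_0, G_1, ..., G_k] where G_j is the set of nodes at
    directed distance j from v (so G_0 = {v}).
    """
    groups = [set() for _ in range(k + 1)]
    groups[0].add(v)
    seen = {v}
    frontier = deque([(v, 0)])
    while frontier:
        node, d = frontier.popleft()
        if d == k:
            continue  # do not expand beyond the k-hop horizon
        for u in in_adj.get(node, []):
            if u not in seen:  # BFS guarantees first visit = shortest distance
                seen.add(u)
                groups[d + 1].add(u)
                frontier.append((u, d + 1))
    return groups
```

For the directed chain a → b → c, `dir_ego2token({'c': ['b'], 'b': ['a']}, 'c', 2)` yields `[{'c'}, {'b'}, {'a'}]`: the most distant causal predecessor lands in the last group, matching the SPD ordering described above.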
3. Message-Passing SSM Scan and Aggregation
DirGraphSSM realizes SSM computation over the directed graph via a scan over the sequence $S_v = (G_v^{(k)}, \dots, G_v^{(0)})$, employing
- A multi-head attention aggregate within each group $G_v^{(j)}$:
$$m_v^{(j)} = \sum_{u \in G_v^{(j)}} \alpha_{uv} \, W x_u,$$
where $\alpha_{uv}$ is the attention weight of source node $u$ for target $v$, and $x_u$ includes structural and positional encodings.
- Processing the resulting sequence $(m_v^{(k)}, \dots, m_v^{(0)})$ with an SSM convolution kernel:
$$h_v = \sum_{j=0}^{k} \bar{K}_j \, m_v^{(j)}.$$
This is mathematically equivalent to a message-passing protocol:
$$h_v = \sum_{j=0}^{k} \sum_{u \in G_v^{(j)}} \alpha_{uv} \, \mathrm{SSM}_j(W x_u),$$
where $\alpha_{uv}$ is the normalized attention coefficient and $\mathrm{SSM}_j$ denotes the kernel applied with a delay of $j$ hops.
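The scan above can be sketched in NumPy for a single head and a single target node. This is an illustrative simplification, not the paper's API: attention is reduced to a dot product with a query vector, and the SSM filter is a scalar per hop.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

def mp_ssm_scan(groups, feats, W, q, ssm_kernel):
    """Single-head message-passing SSM scan for one target node.

    groups     : [G_0..G_k], node ids grouped by hop distance (DirEgo2Token).
    feats[u]   : feature vector of node u (positional encodings folded in).
    W          : value projection, shape (d_out, d_in).
    q          : query vector of the target node, shape (d_out,).
    ssm_kernel : length-(k+1) array; ssm_kernel[j] filters hop-j messages.
    """
    h = np.zeros(W.shape[0])
    for j, group in enumerate(groups):
        if not group:
            continue
        msgs = np.stack([W @ feats[u] for u in sorted(group)])  # (|G_j|, d_out)
        alpha = softmax(msgs @ q)        # normalized attention within hop j
        m_j = alpha @ msgs               # aggregated hop-j message m_v^(j)
        h += ssm_kernel[j] * m_j         # SSM filtering by hop count
    return h
```

Because aggregation happens per hop group and the kernel weight depends only on $j$, the same computation can be batched over all target nodes, which is what makes the formulation parallel rather than recurrent.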
4. Mathematical Formulation
The key equations and computational steps include:
- k-hop in-neighborhood: $\mathcal{N}_k^{-}(v) = \{u \in V : 0 < \mathrm{SPD}(u, v) \le k\}$
- Grouping by hop distance: $G_v^{(j)} = \{u \in V : \mathrm{SPD}(u, v) = j\}$ for $j = 0, \dots, k$
- Multi-head attention-based aggregation: $m_v^{(j)} = \sum_{u \in G_v^{(j)}} \alpha_{uv} \, W x_u$
- SSM scan over structurally ordered sequence, efficiently computed as a convolution with hop-dependent SSM kernels.
These modules are fully parallelizable across nodes and hops, in contrast to inefficient sequence-based handling with explicit padding.
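The paper's exact kernel parameterization is not reproduced here, but for a standard diagonal SSM (in the style of S4D), the hop-indexed filter takes the form $\bar{K}_j = C A^j B$, which can be precomputed for all hops at once:

```python
import numpy as np

def ssm_hop_kernel(A_diag, B, C, k):
    """Hop-indexed SSM convolution kernel K_j = C · A^j · B, j = 0..k.

    A_diag : diagonal state-matrix entries, shape (n,); |A_i| < 1 for stability.
    B, C   : input/output maps, shape (n,) each.
    Returns a length-(k+1) array; K_j is the scalar filter applied to
    messages arriving from hop-j in-neighbors.
    """
    # powers[j, i] = A_i^j, computed for all hops in one broadcast
    powers = A_diag[None, :] ** np.arange(k + 1)[:, None]  # (k+1, n)
    return powers @ (B * C)                                # K_j = sum_i C_i A_i^j B_i
```

Since the kernel depends only on the hop index, every node and every hop group can be processed with the same precomputed weights; no per-node padding or sequential recurrence is required, consistent with the parallelism claim above.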
5. Experimental Results and Performance
DirGraphSSM achieves state-of-the-art (SOTA) accuracy and significant speedup on representative directed graph benchmarks, including classification on ogbg-code2 (Python ASTs), regression on the NA architecture dataset, node classification (self-citation), and classification/regression on directed cyclic graphs (MalNet-Tiny, EDA-HLS). Examples from the results:
- On ogbg-code2, DirGraphSSM achieves test F1 in the 20.5% range, highly competitive with non-SSM models.
- DirGraphSSM attains 1.5× to 2× faster training per epoch compared with the best graph Transformer/Mamba variants.
- Ablations confirm the importance of each module (e.g., DepthPlus positional encoding, DirGatedGCN, Digraph Fusion Attention) for both accuracy and information propagation.
- The message-passing SSM formulation eliminates padding overhead and scales efficiently to large graphs (She et al., 17 Sep 2025).
6. Applications and Implications
DirGraphSSM is particularly suitable for domains requiring accurate, scalable, and interpretable modeling of directed dependencies:
- Code analysis (e.g., program graphs, ASTs in ogbg-code2)
- Neural architecture performance regression (structural causal modeling in NAS)
- Information flow in social and communication networks (where directionality encodes causality)
- Malware detection and security (MalNet-Tiny, where call graphs are directed and cyclic)
- Hardware synthesis graph analysis (EDA-HLS)
The principle of sequentializing local directed neighborhoods with k-hop ego graphs and processing them via SSM kernels could generalize to a wide range of causal inference and signal processing applications on directed graphs.
7. Significance and Future Perspectives
DirGraphSSM marks the first systematic integration of modern state space models with directed graph learning, offering a principled and efficient architecture for capturing long-range causal dependencies (She et al., 17 Sep 2025). Its design aligns the strengths of SSMs (memory, stability, long-term information propagation) with the needs of directed message-passing frameworks. Further development could focus on:
- Deeper theoretical analysis of stability, spectral properties, and expressivity in directed SSM-GNNs
- Extension to temporal or dynamic directed graphs
- Adaptation of architecture for highly heterogeneous or multi-relational directed graphs
The permutation-invariant, parallelizable treatment highlights a clear direction for future research at the intersection of causal graph modeling, sequence learning, and scalable deep learning.