DirEgo2Token: Directed SSM Sequentialization

Updated 24 September 2025

DirEgo2Token is a sequentialization framework that constructs canonical, causal node sequences for directed graphs so that state space models can process them directly.
It combines k-hop ego graphs with DepthPlus positional encodings and DirGatedGCN encodings to capture long-range, direction-sensitive dependencies.
Empirical results show state-of-the-art accuracy and up to 2x training speed improvements, though scaling to dense graphs remains a challenge.

DirEgo2Token is a sequentialization framework designed to enable state space models (SSMs) to operate natively on directed graphs by producing canonical, causal sequences of nodes centered around each vertex. This approach systematically captures long-range, direction-sensitive dependencies and marks a methodological shift in graph machine learning, from reliance on undirected-graph assumptions to full use of directed edge information. DirEgo2Token is the cornerstone of the DirGraphSSM architecture, which demonstrates state-of-the-art empirical performance and improved training efficiency for directed graph learning tasks (She et al., 17 Sep 2025).

1. Foundational Principles and Definitions

DirEgo2Token operates on the premise that directed graphs encode causal relationships that must be preserved in learning architectures. For a graph $G=(V,E)$ and each central node $v\in V$, the approach constructs a k-hop ego graph by collecting all predecessor nodes $u$ such that $SPD(u,v)\leq k$, where $SPD(u,v)$ denotes the shortest path distance from $u$ to $v$ along directed edges. The k-hop predecessor set is formally given by:

$\mathcal{N}_{in}^k(v) = \{u \in V \mid SPD(u, v) \leq k\}$

Nodes in $\mathcal{N}_{in}^k(v)$ are further grouped by hop distance:

$L_i = \{u \in \mathcal{N}_{in}^k(v) \mid SPD(u,v) = i\},\, i\in[0,k],\,L_0=\{v\}$

The causal sequence for node $v$ is thus:

$S_v = (L_k, L_{k-1}, \ldots, L_1, L_0)$

This sequentialization preserves directionality, is permutation invariant within hop groups, and encodes a causal “trace” from predecessors at distance $k$ down to $v$. The construction applies to both acyclic and cyclic graph topologies.
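The construction above can be sketched as a reverse breadth-first search from the central node. This is a minimal illustration, not the authors' implementation: the function name `causal_sequence`, the edge-list input, and the sorting within hop groups (used only to make the output reproducible) are assumptions.

```python
from collections import deque

def causal_sequence(edges, v, k):
    """Return [L_k, ..., L_1, L_0]: predecessors of v grouped by hop distance."""
    # Reverse adjacency: preds[t] lists every s with a directed edge s -> t.
    preds = {}
    for s, t in edges:
        preds.setdefault(t, []).append(s)

    # Reverse BFS from v; dist[u] equals SPD(u, v), the directed distance u -> v.
    dist = {v: 0}
    queue = deque([v])
    while queue:
        node = queue.popleft()
        if dist[node] == k:
            continue  # do not expand past the k-hop radius
        for p in preds.get(node, []):
            if p not in dist:  # first visit gives the shortest distance
                dist[p] = dist[node] + 1
                queue.append(p)

    # Group nodes by hop distance. Sorting is an illustrative choice only:
    # the ordering inside a hop group carries no meaning.
    levels = [[] for _ in range(k + 1)]
    for u, d in dist.items():
        levels[d].append(u)
    return [sorted(level) for level in reversed(levels)]

# Hypothetical toy graph containing the cycle b -> d -> b.
edges = [("a", "b"), ("b", "d"), ("c", "d"), ("d", "b"), ("e", "a")]
seq = causal_sequence(edges, "d", 2)
print(seq)  # [['a'], ['b', 'c'], ['d']]
```

Node `e` is three hops from `d` and is therefore excluded when $k=2$; the cycle between `b` and `d` does not cause repeated visits because each node keeps only its first (shortest) distance.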

2. Architectural Integration: DirGraphSSM

The DirGraphSSM architecture implements state space models for directed graphs using the DirEgo2Token framework as its backbone. Key modules include:

DepthPlus Positional Encoding: Assigns each node a hierarchical depth. Tarjan’s algorithm condenses the graph into an acyclic graph of strongly connected components, depths are computed on that condensation, and each depth is propagated back to the original nodes.
DirGatedGCN Encodings: Locally refines node features by aggregating incoming and outgoing neighbor features with learnable gating.
Digraph SSM Scan: Applies an SSM convolution kernel of the form $\mathbf{\hat{C}}\cdot(\mathbf{A}^k)\cdot\mathbf{B}$ across the causal sequence for each node, enabling distance-aware information aggregation.
Digraph Fusion Attention: Integrates multi-head attention outputs to fuse various message paths and feature dimensions.
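To make the kernel $\mathbf{\hat{C}}\cdot(\mathbf{A}^k)\cdot\mathbf{B}$ concrete, the sketch below evaluates it for every hop distance up to a maximum. It assumes a diagonal state matrix and a single input/output channel, so that $\mathbf{A}$, $\mathbf{B}$, and $\mathbf{\hat{C}}$ reduce to plain lists; the source does not state this parameterization, and the name `diagonal_ssm_kernel` and the numeric values are hypothetical.

```python
def diagonal_ssm_kernel(a_diag, b, c, max_hop):
    """Return [C A^0 B, C A^1 B, ..., C A^max_hop B] for a diagonal A.

    a_diag, b, c are equal-length lists holding the diagonal of A,
    the column B, and the row C (an assumed single-channel setup).
    """
    kernel = []
    powers = [1.0] * len(a_diag)  # diagonal of A^0
    for _ in range(max_hop + 1):
        # For diagonal A, C A^i B is a sum of per-state products.
        kernel.append(sum(ci * pi * bi for ci, pi, bi in zip(c, powers, b)))
        # Advance the diagonal from A^i to A^(i+1).
        powers = [pi * ai for pi, ai in zip(powers, a_diag)]
    return kernel

# Hypothetical parameters: two state dimensions with decay rates 0.5 and 0.25.
hop_weights = diagonal_ssm_kernel([0.5, 0.25], [1.0, 1.0], [1.0, 2.0], max_hop=3)
print(hop_weights)  # [3.0, 1.0, 0.375, 0.15625]
```

With state entries of magnitude below one, the weight shrinks as the hop distance grows, which is one way the scan can be distance-aware: a predecessor at distance $i$ is scaled by the $i$-th kernel entry.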

Message passing in DirGraphSSM is parallelized over k-hop ego graphs, with each predecessor $u$ contributing via an attention mechanism:

$\alpha_{u,v} = \frac{\kappa(x_v,x_u)}{\sum_{w\in\mathcal{N}_{in}^k(v)}\kappa(x_v,x_w)}$

$y_v = \sum_{u\in\mathcal{N}_{in}^k(v)}\alpha_{u,v}\cdot SSM^{SPD(u,v)}(f(x_u)\mathbf{W}_V)$

The attention kernel is:

$\kappa(x_v, x_u) = \exp\left(\frac{\langle f(x_v)\mathbf{W}_Q, f(x_u)\mathbf{W}_K \rangle}{\sqrt{d_k}}\right)$
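The three equations above can be combined in a short sketch. Several simplifications are assumed and are not taken from the source: $f$ and the projections $\mathbf{W}_Q$, $\mathbf{W}_K$, $\mathbf{W}_V$ are the identity, and $SSM^{SPD(u,v)}$ is modeled as multiplication by one scalar weight per hop distance. The function name `aggregate` and the input layout are hypothetical.

```python
import math

def aggregate(x_v, preds, hop_weights):
    """Attention-weighted aggregation over a k-hop predecessor set.

    x_v:         feature vector of the central node (list of floats)
    preds:       list of (x_u, spd) pairs; the center itself appears with spd 0
    hop_weights: hop_weights[d] stands in for SSM^d (an assumed scalar form)
    """
    d_k = len(x_v)
    # Scaled dot products <x_v, x_u> / sqrt(d_k), i.e. the exponent of kappa.
    scores = [sum(q * key for q, key in zip(x_v, x_u)) / math.sqrt(d_k)
              for x_u, _ in preds]
    # Subtracting the maximum before exp() avoids overflow and leaves the
    # normalized weights alpha unchanged.
    top = max(scores)
    kappas = [math.exp(s - top) for s in scores]
    total = sum(kappas)
    alphas = [kp / total for kp in kappas]

    # y_v = sum_u alpha_{u,v} * SSM^{SPD(u,v)}(x_u)
    y = [0.0] * d_k
    for alpha, (x_u, spd) in zip(alphas, preds):
        for j in range(d_k):
            y[j] += alpha * hop_weights[spd] * x_u[j]
    return alphas, y

# Hypothetical inputs: the center plus two predecessors at distances 1 and 2.
x_center = [1.0, 0.0]
neighborhood = [([1.0, 0.0], 0), ([0.0, 1.0], 1), ([0.0, 1.0], 2)]
alphas, y_v = aggregate(x_center, neighborhood, hop_weights=[1.0, 0.5, 0.25])
```

In this toy case the two predecessors have identical features and therefore identical attention weights; they still contribute differently to $y_v$ because the hop-dependent weight separates distance 1 from distance 2.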

3. Handling Hierarchy and Cycles

DirEgo2Token’s hierarchical encoding extends beyond DAGs to cyclic graphs by decomposing them into strongly connected components (SCCs). Tarjan’s algorithm is used to form a condensation DAG, a depth is assigned to each component, and that depth is propagated back to the component’s nodes in the original graph, enabling hierarchical positional encodings even in the presence of cycles. For DAGs, node depth is computed recursively:

$depth(v)= \begin{cases} 0 & \text{if } \mathrm{indegree}(v)=0 \\ 1 + \max\{depth(u) \mid (u,v)\in E\} & \text{otherwise} \end{cases}$
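The procedure described in this section can be sketched as follows: find SCCs with Tarjan’s algorithm, apply the recursive depth rule on the condensation DAG, and give every node the depth of its component. This is an illustrative sketch rather than the authors' code; the function name `depth_encoding` is hypothetical, and assigning one shared depth to all nodes of an SCC is an assumption about how the propagation step works. The recursive helper is adequate for small graphs only, since very deep graphs would exceed Python's recursion limit.

```python
def depth_encoding(nodes, edges):
    """Map each node to the depth of its SCC in the condensation DAG."""
    succ = {n: [] for n in nodes}
    for s, t in edges:
        succ[s].append(t)

    index, low, comp = {}, {}, {}
    stack, on_stack, sccs = [], set(), []

    def strongconnect(n):
        # Standard recursive Tarjan step.
        index[n] = low[n] = len(index)
        stack.append(n)
        on_stack.add(n)
        for m in succ[n]:
            if m not in index:
                strongconnect(m)
                low[n] = min(low[n], low[m])
            elif m in on_stack:
                low[n] = min(low[n], index[m])
        if low[n] == index[n]:  # n is the root of an SCC
            members = []
            while True:
                m = stack.pop()
                on_stack.discard(m)
                comp[m] = len(sccs)
                members.append(m)
                if m == n:
                    break
            sccs.append(members)

    for n in nodes:
        if n not in index:
            strongconnect(n)

    # Predecessor components in the condensation DAG (intra-SCC edges dropped).
    comp_preds = {c: set() for c in range(len(sccs))}
    for s, t in edges:
        if comp[s] != comp[t]:
            comp_preds[comp[t]].add(comp[s])

    # Tarjan emits SCCs in reverse topological order, so iterating backwards
    # visits every component after all of its predecessors.
    comp_depth = {}
    for c in reversed(range(len(sccs))):
        ps = comp_preds[c]
        comp_depth[c] = 1 + max(comp_depth[p] for p in ps) if ps else 0

    # Propagate each component's depth back to its original nodes.
    return {n: comp_depth[comp[n]] for n in nodes}

# Hypothetical toy graph: b and c form a cycle, d has two incoming paths.
nodes = ["a", "b", "c", "d", "e"]
edges = [("a", "b"), ("b", "c"), ("c", "b"), ("c", "d"), ("e", "d")]
depths = depth_encoding(nodes, edges)
print(depths)  # {'a': 0, 'b': 1, 'c': 1, 'd': 2, 'e': 0}
```

On a DAG every SCC is a single node, so the result coincides with the recursive definition above.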

4. Empirical Performance and Efficiency

DirGraphSSM equipped with DirEgo2Token achieves state-of-the-art or competitive accuracy on diverse directed graph benchmarks: ogbg-code2, NA, and self-citation for DAGs, and MalNet-Tiny and EDA-HLS for cyclic graphs. On the ogbg-code2 benchmark, DirGraphSSM attains F1 scores competitive with leading methods, with reported speedups of $1.5\times$ to $2\times$ in training epoch duration compared to existing graph state space models such as GMN and Graph-Mamba.

The efficiency is attributed to the parallelized SSM scan over k-hop neighborhoods and avoidance of padding for variable-length ego sequences. This makes DirEgo2Token practical for large-scale and sparse directed graphs.

5. Limitations and Prospective Extensions

A known limitation is scalability to dense graphs: as $k$ increases, the size of the ego sequence can grow rapidly, posing computational challenges. Adaptive techniques for sequence length or neighborhood sampling may be necessary to mitigate this overhead. Furthermore, the current formulation does not accommodate edge labels or multi-relational graphs; extending SSMs to such heterogeneity is a proposed area of future investigation.

Optimal selection of the k-hop radius remains an open question; future research may focus on data-driven or dynamically learned neighborhood sizes that balance the capture of long-range dependencies against computational cost.

6. Context Within Graph Learning Research

DirEgo2Token marks the first systematic extension of SSMs to directed graphs, bridging causal sequence modeling and message passing. Preceding work in graph SSMs operated exclusively on undirected graphs, limiting the ability to model directional dependencies. By sequentializing directed neighborhoods, DirEgo2Token enables SSMs to natively capture causal flows, offering a generalization pathway for architectures such as SSM-Mamba and GMN.

The combined use of local DirGatedGCN encodings and global DepthPlus positional encodings situates DirGraphSSM as a hybrid approach, integrating multiple structural scales. Probabilistic and attention-based message aggregation further refine the representations, supporting diverse graph learning tasks including node classification, regression, and representation learning.

7. Mathematical Structure and Implementation

The mathematical structure underlying DirEgo2Token’s sequentialization and aggregation is given explicitly by the equations above. Implementation utilizes efficient graph traversal queries for predecessor discovery, Tarjan condensation for hierarchy, and parallelized kernel application via message-passing protocols. The attention mechanism and SSM kernel composition are parameterized for each hop distance, offering fine-grained control over the influence propagation.

A plausible implication is that DirEgo2Token’s sequence construction may be adaptable to other sequential modeling paradigms, given its canonical yet permutation-invariant ordering scheme. The blending of SSM scan with graph attention mechanisms offers a basis for further theoretical and empirical study.

DirEgo2Token provides an operational paradigm shift in directed graph learning, allowing SSMs to exploit causal structure in message passing architectures. The framework’s empirical robustness and speed suggest broad applicability across citation networks, electronic design automation, and any domain characterized by directed, causal graph data.

References

State Space Models over Directed Graphs (2025)
