Simplicial Attention Networks

Updated 31 May 2026

Simplicial Attention Networks (SAN) are neural architectures that apply masked self-attention to capture multi-way interactions on higher-order simplices.
SANs extend Graph Attention Networks by aggregating information from both lower and upper adjacencies, enhancing expressivity and task-driven learning.
They have demonstrated robust performance in areas such as trajectory prediction and missing data imputation while scaling to large, complex simplicial structures.

Simplicial Attention Networks (SANs) are neural architectures designed to operate on data defined over simplicial complexes by leveraging masked self-attention mechanisms. By generalizing the widely used Graph Attention Network (GAT) framework, SANs enable flexible, learnable, and task-driven aggregation of information not only between nodes (0-simplices) but also between arbitrary higher-order simplices (edges, triangles, tetrahedra, etc.), accounting for both lower and upper adjacencies in the underlying topological domain. This extension captures multi-way interactions and allows more expressive, structure-aware message passing, with rigorous connections to the principles of topological signal processing and Hodge theory.

1. Simplicial Complexes and Topological Foundations

A simplicial complex $X$ of maximum dimension $K$ is a collection of subsets of a finite vertex set $V$, where $D_k$ denotes the set of $k$-simplices (subsets of size $k+1$), with the closure property that the inclusion of a $k$-simplex $\sigma^k$ implies inclusion of all its faces (subsets of size $k$). The associated $k$-chains $C_k(X)$ are real vector spaces over oriented $k$-simplices, and $k$-cochains $C^k(X)$ are their duals, i.e., real-valued signals over $D_k$.

Boundary operators $\partial_k : C_k(X) \to C_{k-1}(X)$ are represented by signed incidence matrices $B_k$, with entries in $\{-1, 0, +1\}$ encoding the inclusion and orientation of $(k-1)$-faces in $k$-simplices. The combinatorial Hodge Laplacian at each order $k$ is

$L_k = B_k^\top B_k + B_{k+1} B_{k+1}^\top = L_k^{d} + L_k^{u},$

decomposing into lower (down) and upper (up) Laplacians related to face and coface interactions, respectively (Giusti et al., 2022, Giusti, 2024). By convention the lower term vanishes for $k = 0$ and the upper term vanishes for $k = K$.

Lower adjacency relates $k$-simplices sharing a common $(k-1)$-face, while upper adjacency is defined when two $k$-simplices are both faces of a common $(k+1)$-simplex.
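
To make these definitions concrete, the following minimal sketch builds the signed incidence matrices and the order-1 Hodge Laplacian for a toy complex with four vertices, five edges, and one filled triangle. The helper name `boundary_matrix` and the toy complex are illustrative assumptions, not taken from the cited papers; simplices are stored as sorted vertex tuples and oriented by vertex order.

```python
import numpy as np

def boundary_matrix(faces, simplices):
    """Signed incidence matrix: rows index (k-1)-faces, columns index k-simplices."""
    index = {f: i for i, f in enumerate(faces)}
    B = np.zeros((len(faces), len(simplices)))
    for j, s in enumerate(simplices):
        for pos in range(len(s)):
            face = s[:pos] + s[pos + 1:]      # drop the vertex at position `pos`
            B[index[face], j] = (-1) ** pos   # alternating orientation sign
    return B

# Toy complex (hypothetical example): one filled triangle plus two extra edges.
vertices = [(0,), (1,), (2,), (3,)]
edges = [(0, 1), (0, 2), (1, 2), (1, 3), (2, 3)]
triangles = [(0, 1, 2)]

B1 = boundary_matrix(vertices, edges)     # vertices x edges
B2 = boundary_matrix(edges, triangles)    # edges x triangles

L1_down = B1.T @ B1        # lower Laplacian: edges interacting via shared vertices
L1_up = B2 @ B2.T          # upper Laplacian: edges interacting via shared triangles
L1 = L1_down + L1_up       # Hodge Laplacian of order 1

# Adjacency masks are the off-diagonal supports of the two Laplacians.
off_diag = ~np.eye(len(edges), dtype=bool)
lower_adj = (L1_down != 0) & off_diag
upper_adj = (L1_up != 0) & off_diag
```

In this example only the three edges of the filled triangle are upper adjacent, whereas every pair of edges sharing a vertex is lower adjacent. These two masks are exactly the supports over which a simplicial attention layer is allowed to place attention weights.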

2. Self-Attention Mechanisms on Simplicial Complexes

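SANs generalize masked self-attention from edges to arbitrary $k$-simplices. For each simplex order $k$ and SAN layer:

Feature Projections: Each input feature matrix $X_k$ (where $X_k \in \mathbb{R}^{N_k \times F}$, with $N_k$ the number of $k$-simplices and $F$ the number of input channels) is projected into queries, keys, and values via learned matrices for both lower (down) and upper (up) branches, and for multiple polynomial filter orders $J_d$, $J_u$. Concatenation yields $Q_k^{d}$, $K_k^{d}$, $V_k^{d}$ for the lower branch, and analogously $Q_k^{u}$, $K_k^{u}$, $V_k^{u}$ for upper.

Masked Attention: For each branch, attention is computed using

$e_{ij}^{d} = \rho\big( a_d^\top [\, q_i^{d} \,\|\, k_j^{d} \,] \big), \quad j \in \mathcal{N}_d(i),$

$\alpha_{ij}^{d} = \dfrac{\exp(e_{ij}^{d})}{\sum_{m \in \mathcal{N}_d(i)} \exp(e_{im}^{d})},$

where $q_i^{d}$ and $k_j^{d}$ are rows of $Q_k^{d}$ and $K_k^{d}$, $\|$ denotes concatenation, $\mathcal{N}_d(i)$ is the set of lower neighbors of simplex $i$, and $\rho$ is a pointwise nonlinearity (e.g., LeakyReLU, as in GAT); and likewise for the upper branch, with an independent attention vector $a_u$. Attention is thus strictly supported over valid topological neighbors as encoded by the incidence structure.

Sparse Attentional Laplacians: These coefficients populate sparse matrices $\tilde{L}_k^{d}$ and $\tilde{L}_k^{u}$, which are then used for depth-wise polynomial filtering of the features, enabling multi-hop information propagation:

$Z_k^{d} = \sum_{j=1}^{J_d} (\tilde{L}_k^{d})^{j} X_k W_j^{d},$

$Z_k^{u} = \sum_{j=1}^{J_u} (\tilde{L}_k^{u})^{j} X_k W_j^{u},$

where $W_j^{d}$ and $W_j^{u}$ are the learned value projections for filter order $j$.

Optionally, a (possibly approximate) harmonic projection $\hat{P}_k$ can be included to account for features constant on cycles (important in topologically rich domains).

Feature Update and Multi-Head Aggregation: With or without multiple attention heads ($H$), the output is

$X_k' = \phi\big( Z_k^{d} + Z_k^{u} + \hat{P}_k X_k W^{h} \big),$

where $\phi$ is a nonlinearity. Outputs from multiple heads are concatenated or averaged.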

This mechanism is stacked in layers, optionally with skip/residual connections and layer normalization (Giusti et al., 2022, Giusti, 2024, Battiloro et al., 2023).
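
The pipeline above (projection, masked attention, attentional Laplacians, polynomial filtering, nonlinearity) can be sketched in a few lines of NumPy. This is a simplified single-head illustration under stated assumptions, not the reference implementation of any cited paper: the scoring function is GAT-style with a LeakyReLU, the harmonic branch is omitted, both branches share the filter order `J`, and all names (`attentional_laplacian`, `san_layer`, the parameter dictionary keys) are hypothetical.

```python
import numpy as np

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def attentional_laplacian(X, mask, W, a):
    """Row-normalized attention matrix supported only where `mask` is True."""
    H = X @ W                                   # projected features, shape (N, F_att)
    f = H.shape[1]
    # GAT-style score a^T [h_i || h_j], split into source and target parts
    scores = leaky_relu((H @ a[:f])[:, None] + (H @ a[f:])[None, :])
    scores = np.where(mask, scores, -np.inf)    # mask out non-neighbors
    row_max = scores.max(axis=1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    weights = np.exp(scores - row_max)          # masked entries become exactly 0
    denom = weights.sum(axis=1, keepdims=True)
    return np.divide(weights, denom, out=np.zeros_like(weights), where=denom > 0)

def san_layer(X, mask_down, mask_up, p, J=2):
    """One single-head layer: polynomial filtering with attentional Laplacians."""
    L_d = attentional_laplacian(X, mask_down, p["W_att_d"], p["a_d"])
    L_u = attentional_laplacian(X, mask_up, p["W_att_u"], p["a_u"])
    out = np.zeros((X.shape[0], p["W_d"][0].shape[1]))
    Z_d, Z_u = X, X
    for j in range(J):
        Z_d, Z_u = L_d @ Z_d, L_u @ Z_u         # (L_d)^(j+1) X and (L_u)^(j+1) X
        out += Z_d @ p["W_d"][j] + Z_u @ p["W_u"][j]
    return np.tanh(out)                         # harmonic branch omitted in this sketch

# Toy complex: 5 edges (0,1),(0,2),(1,2),(1,3),(2,3) and one triangle (0,1,2).
B1 = np.array([[-1, -1,  0,  0,  0],
               [ 1,  0, -1, -1,  0],
               [ 0,  1,  1,  0, -1],
               [ 0,  0,  0,  1,  1]], dtype=float)
B2 = np.array([[1], [-1], [1], [0], [0]], dtype=float)
off_diag = ~np.eye(5, dtype=bool)
mask_down = ((B1.T @ B1) != 0) & off_diag       # lower adjacency between edges
mask_up = ((B2 @ B2.T) != 0) & off_diag         # upper adjacency between edges

rng = np.random.default_rng(0)
F_in, F_att, F_out, J = 3, 4, 2, 2
params = {
    "W_att_d": rng.normal(size=(F_in, F_att)), "a_d": rng.normal(size=2 * F_att),
    "W_att_u": rng.normal(size=(F_in, F_att)), "a_u": rng.normal(size=2 * F_att),
    "W_d": [rng.normal(size=(F_in, F_out)) for _ in range(J)],
    "W_u": [rng.normal(size=(F_in, F_out)) for _ in range(J)],
}
X1 = rng.normal(size=(5, F_in))                 # one feature row per edge
X1_new = san_layer(X1, mask_down, mask_up, params, J=J)
```

Edges with no upper neighbors (here the two edges outside the triangle) receive an all-zero row in the upper attentional matrix, so only the lower branch contributes to their update. A practical implementation would store both matrices in a sparse format instead of the dense masks used here.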

3. Algorithmic and Architectural Properties

SANs admit efficient vectorized implementation due to the sparsity of simplicial adjacency, using data structures and algorithms analogous to those in sparse GNN frameworks. The cost per layer, measured per simplex, grows with the maximum neighborhood size.

Crucially, SANs are permutation equivariant: permuting vertex orderings induces corresponding reorderings in all $k$-simplices, signal tensors, and incidence matrices, with the update equations commuting with this action (Battiloro et al., 2023). Simplicial-awareness holds, in that the network's outputs depend on the inclusion of higher-order simplices, not just on underlying graphs, enabling full exploitation of higher-order combinatorial structures.
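
The equivariance property can be checked numerically on the filtering step. The sketch below is an illustration under simplifying assumptions: it uses a fixed symmetric matrix as a stand-in for a Laplacian instead of learned attention coefficients, models a relabeling of the simplices by a permutation matrix, and uses a hypothetical helper `poly_filter`. It verifies that relabeling the inputs relabels the outputs in the same way.

```python
import numpy as np

def poly_filter(L, X, weights):
    """tanh( sum_j L^j X W_j ) for j = 1..len(weights)."""
    out = np.zeros((X.shape[0], weights[0].shape[1]))
    Z = X
    for W in weights:
        Z = L @ Z                 # next power of L applied to X
        out += Z @ W
    return np.tanh(out)

rng = np.random.default_rng(0)
n, f_in, f_out = 6, 3, 2
A = rng.normal(size=(n, n))
L = A @ A.T                       # symmetric stand-in for a Laplacian (assumption)
X = rng.normal(size=(n, f_in))
weights = [rng.normal(size=(f_in, f_out)) for _ in range(3)]

perm = rng.permutation(n)
P = np.eye(n)[perm]               # permutation matrix: (P @ X)[i] == X[perm[i]]

lhs = poly_filter(P @ L @ P.T, P @ X, weights)   # filter the relabeled complex
rhs = P @ poly_filter(L, X, weights)             # relabel the filtered output
equivariant = np.allclose(lhs, rhs)
```

The check passes because $(P L P^\top)^j P X = P L^j X$ for any permutation matrix $P$. Masked attention scores are computed per pair of neighboring simplices, so the same relabeling argument carries over to the attentional matrices.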

Removing the attention masks and harmonic branch reduces SAN to various previously established SNN architectures: e.g., replacing the learned attention coefficients with the combinatorial Laplacians yields simplicial convolutions [Ebli et al. 2020]; further simplification recovers message-passing simplicial nets [Bodnar et al. 2021]. Specializing to $k = 0$ (signals on nodes only) recovers standard GAT (Giusti et al., 2022).

4. Relation to Other Attention-Based and Simplicial Models

SANs systematically generalize attention-based learning on graphs by operating on arbitrary simplex orders and learning separate attentional weights over upper and lower adjacency relations (Giusti et al., 2022, Giusti, 2024). In contrast, models such as GAT are restricted to masked attention over edge neighborhoods (1-simplices), and classical SNNs/Simplicial ConvNets apply non-adaptive, purely combinatorial aggregations.

SAT (Goh et al., 2022) differs from SAN by employing a single attention over all neighbors and omitting harmonic projections. SGAT (Lee et al., 2022) extends the SAN paradigm to heterogeneous graphs by constructing simplicial complexes in which higher-order simplices aggregate heterogeneous node and edge types, enabling attention-based aggregation across target-type cliques and their shared non-target neighbors via upper-adjacency.

Generalized Simplicial Attention Networks (GSAN) (Battiloro et al., 2023) further extend SAN to process data residing on both simplicial and cell complexes, impose Hodge-theoretic principles explicitly, and demonstrate provable permutation equivariance and order-$k$ simplicial awareness.

5. Empirical Evaluation and Applications

SANs and their variants have been evaluated on a variety of tasks involving higher-order data:

Trajectory Prediction: On synthetic planar flow and real-world ocean drifter datasets, SAN outperforms MPSN, SCNN, and SAT baselines in accuracy (Giusti et al., 2022, Battiloro et al., 2023).
Missing Data Imputation: On citation complexes with up to fifth-order simplices, SANs attain higher accuracy than SCNN, SNN, and SAT, showing greater robustness at high masking rates and high simplex order (Giusti et al., 2022, Battiloro et al., 2023).
Node and Graph Classification: SGAT (Lee et al., 2022) demonstrates state-of-the-art performance on heterogeneous node classification tasks (DBLP, ACM, IMDB); on DBLP it outperforms GAT, HAN, and metapath-based methods in both Macro-F1 and Micro-F1. When node features are replaced by random noise, SGAT still retains a high Micro-F1, underscoring the structural expressivity of simplicial attention.
Simplex Prediction: GSAN achieves top ROC-AUC on filled vs open simplex prediction in citation complexes (Battiloro et al., 2023).

6. Implementation, Scalability, and Practical Considerations

SAN layers operate locally on sparse neighborhoods, supporting scalability to complexes with millions of simplices, provided neighborhood sizes are bounded. Batch and subgraph sampling techniques from GNNs can be applied to mitigate memory bottlenecks. Hyperparameters such as the polynomial filter order (receptive field depth), the number of attention heads, and the number of hidden channels directly trade off expressive power and complexity.

The inclusion of the harmonic branch is beneficial when the application involves global cycles or homological features, but computation of sparse harmonic projectors can be costly if the Laplacian kernel is large; approximate methods (e.g., Chebyshev filtering) are applicable (Giusti et al., 2022). Extensions to time-varying complexes, positional encodings, and hierarchical pooling are plausible avenues, as suggested in the literature.
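
As an illustration of why an approximate harmonic projector can avoid an explicit eigendecomposition, the sketch below uses a simple polynomial approximation, $(I - \epsilon L_k)^J$ with $0 < \epsilon < 2/\lambda_{\max}$, which converges to the projector onto the kernel of $L_k$ as $J$ grows. This is a simpler stand-in for the Chebyshev filtering mentioned above, not that method itself; the function name `approx_harmonic_projector`, the toy complex, and the chosen values of `eps` and `J` are assumptions for demonstration.

```python
import numpy as np

def approx_harmonic_projector(L, eps, J):
    """Approximate the projector onto ker(L) by the matrix polynomial (I - eps*L)^J."""
    n = L.shape[0]
    return np.linalg.matrix_power(np.eye(n) - eps * L, J)

# Toy complex: 5 edges (0,1),(0,2),(1,2),(1,3),(2,3) and one triangle (0,1,2).
B1 = np.array([[-1, -1,  0,  0,  0],
               [ 1,  0, -1, -1,  0],
               [ 0,  1,  1,  0, -1],
               [ 0,  0,  0,  1,  1]], dtype=float)
B2 = np.array([[1], [-1], [1], [0], [0]], dtype=float)
L1 = B1.T @ B1 + B2 @ B2.T                  # order-1 Hodge Laplacian

# Exact projector from an eigendecomposition (costly for large complexes).
evals, evecs = np.linalg.eigh(L1)
U = evecs[:, evals < 1e-9]                  # basis of the harmonic space ker(L1)
P_exact = U @ U.T

# Polynomial approximation: needs only repeated products with L1.
lam_max = evals[-1]
P_approx = approx_harmonic_projector(L1, eps=1.0 / lam_max, J=60)

max_error = np.abs(P_approx - P_exact).max()
```

The toy complex has a single unfilled cycle, so the harmonic space is one-dimensional. The nonzero eigenvalues are damped geometrically, at a rate set by the ratio of the smallest nonzero eigenvalue to $\lambda_{\max}$, which is why a small spectral gap requires a larger $J$. In practice the dense `matrix_power` call would be replaced by repeated sparse products applied directly to the feature matrix.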

7. Limitations and Extensions

The primary strength of SANs lies in their principled, learnable, and topologically intrinsic approach to multi-way relational modeling. However, direct global attention is not natively supported in the base SAN architecture, and harmonic projection computations may become prohibitive for highly connected or large-scale complexes. Applying SANs to dynamic or evolving simplicial complexes, incorporating cell complexes beyond simplicial structures, and extending attention beyond local (upper/lower) neighborhoods remain active areas of research (Battiloro et al., 2023, Giusti, 2024).

In summary, Simplicial Attention Networks advance the expressive capacity of neural message-passing architectures to higher-order topological domains by introducing adaptive, interpretable attention mechanisms that operate coherently within the combinatorial and algebraic framework of simplicial complexes. This innovation enables effective learning and inference on data where interactions intrinsically transcend pairwise relations.

References (5)

Simplicial Attention Neural Networks (2022)

Topological Neural Networks: Mitigating the Bottlenecks of Graph Neural Networks via Higher-Order Interactions (2024)

Generalized Simplicial Attention Neural Networks (2023)

Simplicial Attention Networks (2022)

SGAT: Simplicial Graph Attention Network (2022)
