
Simplicial Attention Networks

Updated 31 May 2026
  • Simplicial Attention Networks (SAN) are neural architectures that apply masked self-attention to capture multi-way interactions on higher-order simplices.
  • SANs extend Graph Attention Networks by aggregating information from both lower and upper adjacencies, enhancing expressivity and task-driven learning.
  • They have demonstrated robust performance in areas such as trajectory prediction and missing data imputation while scaling to large, complex simplicial structures.

Simplicial Attention Networks (SAN) are neural architectures designed to operate on data defined over simplicial complexes by leveraging masked self-attention mechanisms. By generalizing the widely used Graph Attention Network (GAT) framework, SANs enable flexible, learnable, and task-driven aggregation of information not only between nodes (0-simplices) but also between arbitrary higher-order simplices (edges, triangles, tetrahedra, etc.), accounting for both lower and upper adjacencies in the underlying topological domain. This extension captures multi-way interactions and allows more expressive, structure-aware message passing, with rigorous connections to the principles of topological signal processing and Hodge theory.

1. Simplicial Complexes and Topological Foundations

A simplicial complex $X$ of maximum dimension $K$ is a collection of subsets $D_k$ of size $k+1$ from a finite vertex set $V$, with the closure property that the inclusion of a $k$-simplex $\sigma^k$ implies inclusion of all its faces (subsets of size $k$). The associated $k$-chains $C_k(X)$ are real vector spaces over oriented $k$-simplices, and $k$-cochains $C^k(X)$ are their duals, i.e., real-valued signals over $D_k$.

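Boundary operators $\partial_k : C_k(X) \to C_{k-1}(X)$ are represented by signed incidence matrices $B_k$, with entries $[B_k]_{ij} \in \{0, \pm 1\}$ encoding the inclusion and orientation of $(k-1)$-faces in $k$-simplices. The combinatorial Hodge Laplacian at each order $k$ is

$$L_k = \underbrace{B_k^\top B_k}_{L_k^{d}} + \underbrace{B_{k+1} B_{k+1}^\top}_{L_k^{u}}$$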

decomposing into lower (down) and upper (up) Laplacians related to face and coface interactions, respectively (Giusti et al., 2022, Giusti, 2024).

Lower adjacency relates $k$-simplices sharing a common $(k-1)$-face, while upper adjacency is defined when two $k$-simplices are both faces of a common $(k+1)$-simplex.
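The definitions above can be made concrete on a toy complex. The following is a minimal NumPy sketch, not taken from the cited papers: the complex (four vertices, four edges, one filled triangle), the low-to-high orientation convention, and the helper names `boundary_1` and `boundary_2` are assumptions chosen for illustration. It builds the incidence matrices, the lower and upper Laplacians of order 1, and the two adjacency relations between edges.

```python
import numpy as np

# Toy complex (an assumption for illustration): vertices 0..3,
# edges oriented low -> high, one filled triangle (0, 1, 2).
edges = [(0, 1), (0, 2), (1, 2), (2, 3)]
triangles = [(0, 1, 2)]

def boundary_1(n_vertices, edges):
    """B_1 (vertices x edges): -1 at the tail and +1 at the head of each edge."""
    B = np.zeros((n_vertices, len(edges)))
    for j, (u, v) in enumerate(edges):
        B[u, j], B[v, j] = -1.0, 1.0
    return B

def boundary_2(edges, triangles):
    """B_2 (edges x triangles): boundary of (a, b, c) is (b, c) - (a, c) + (a, b)."""
    index = {e: i for i, e in enumerate(edges)}
    B = np.zeros((len(edges), len(triangles)))
    for j, (a, b, c) in enumerate(triangles):
        B[index[(b, c)], j] = 1.0
        B[index[(a, c)], j] = -1.0
        B[index[(a, b)], j] = 1.0
    return B

B1 = boundary_1(4, edges)
B2 = boundary_2(edges, triangles)

L1_down = B1.T @ B1      # lower Laplacian: edges coupled through shared vertices
L1_up = B2 @ B2.T        # upper Laplacian: edges coupled through shared triangles
L1 = L1_down + L1_up     # Hodge Laplacian of order 1

# Off-diagonal supports give the lower and upper adjacency relations between edges.
off_diagonal = ~np.eye(len(edges), dtype=bool)
lower_adj = (L1_down != 0) & off_diagonal
upper_adj = (L1_up != 0) & off_diagonal
```

In this toy complex, edge (2, 3) is lower-adjacent to the two edges that touch vertex 2 but has no upper neighbors, because it does not belong to any triangle.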

2. Self-Attention Mechanisms on Simplicial Complexes

SANs generalize masked self-attention from edges to arbitrary $k$-simplices. For each simplex order $k$ and SAN layer:

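  • Feature Projections: Each input feature matrix $X_k$ (where $X_k \in \mathbb{R}^{N_k \times F}$, with $N_k$ the number of $k$-simplices and $F$ the number of input features) is projected into queries, keys, and values via learned matrices for both lower (down) and upper (up) branches, and for multiple polynomial filter orders $p = 1, \dots, J_d$ and $p = 1, \dots, J_u$. Concatenation over $p$ yields $Q_d$, $K_d$, $V_d$ for the lower branch, and analogously $Q_u$, $K_u$, $V_u$ for the upper branch.
  • Masked Attention: For each branch, attention is computed using

$$e^{d}_{ij} = \mathrm{LeakyReLU}\!\left(a_d^\top \left[\, [Q_d]_i \,\|\, [K_d]_j \,\right]\right), \qquad j \in \mathcal{N}_d(i),$$

$$\alpha^{d}_{ij} = \frac{\exp\left(e^{d}_{ij}\right)}{\sum_{l \in \mathcal{N}_d(i)} \exp\left(e^{d}_{il}\right)},$$

where $[Q_d]_i$ and $[K_d]_j$ are rows of $Q_d$ and $K_d$, $\mathcal{N}_d(i)$ is the set of lower neighbors of simplex $i$, and $a_d$ is a learned attention vector, and likewise for the upper branch, with an independent attention vector $a_u$. Attention is thus strictly supported over valid topological neighbors as encoded by the incidence structure.

  • Sparse Attentional Laplacians: These coefficients populate sparse matrices $\tilde{L}_d$ and $\tilde{L}_u$, with $[\tilde{L}_d]_{ij} = \alpha^{d}_{ij}$ and $[\tilde{L}_u]_{ij} = \alpha^{u}_{ij}$, which are then used for depth-wise polynomial filtering of the features, enabling multi-hop information propagation:

$$Z_d = \sum_{p=1}^{J_d} \tilde{L}_d^{\,p} \, V_{d,p}$$

$$Z_u = \sum_{p=1}^{J_u} \tilde{L}_u^{\,p} \, V_{u,p}$$

Here $V_{d,p}$ and $V_{u,p}$ denote the order-$p$ blocks of $V_d$ and $V_u$. Optionally, a (possibly approximate) harmonic projection $\tilde{P}$ can be included to account for features constant on cycles (important in topologically rich domains).

  • Feature Update and Multi-Head Aggregation: With or without multiple attention heads ($H \geq 1$), the output is

$$X_k' = \phi\left(Z_d + Z_u + \tilde{P} X_k W_h\right)$$

where $\phi$ is a nonlinearity, $W_h$ is a learned matrix, and the last term is present only when the harmonic branch is used. Outputs from multiple heads are concatenated or averaged.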

This mechanism is stacked in layers, optionally with skip/residual connections and layer normalization (Giusti et al., 2022, Giusti, 2024, Battiloro et al., 2023).
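The projection, masked-attention, polynomial-filtering, and update steps described in this section can be sketched in a few lines of NumPy. This is a minimal single-head illustration under stated assumptions, not the authors' implementation: it uses a GAT-style LeakyReLU score on concatenated queries and keys, `tanh` as the nonlinearity, dense matrices instead of sparse ones, masks without self-loops, and no harmonic branch. All function and parameter names (`masked_softmax`, `attention_branch`, `san_layer`, `params`) are hypothetical.

```python
import numpy as np

def masked_softmax(scores, mask):
    """Row-wise softmax restricted to mask; rows without neighbors become all zero."""
    masked = np.where(mask, scores, -np.inf)
    row_max = np.max(masked, axis=1, keepdims=True)
    row_max = np.where(np.isfinite(row_max), row_max, 0.0)
    weights = np.where(mask, np.exp(masked - row_max), 0.0)
    denom = weights.sum(axis=1, keepdims=True)
    return np.divide(weights, denom, out=np.zeros_like(weights), where=denom > 0)

def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def attention_branch(X, mask, W_q, W_k, W_v, a):
    """One branch (lower or upper). W_q, W_k, W_v hold one matrix per polynomial order."""
    Q = np.concatenate([X @ W for W in W_q], axis=1)   # (N, J * F_out)
    K = np.concatenate([X @ W for W in W_k], axis=1)
    d = Q.shape[1]
    # a^T [q_i || k_j] splits into a query term plus a key term.
    scores = leaky_relu((Q @ a[:d])[:, None] + (K @ a[d:])[None, :])
    A = masked_softmax(scores, mask)                   # attentional "Laplacian"
    Z = np.zeros((X.shape[0], W_v[0].shape[1]))
    A_power = np.eye(X.shape[0])
    for W in W_v:                                      # sum over p of A^p (X W_p)
        A_power = A_power @ A
        Z += A_power @ (X @ W)
    return Z, A

def san_layer(X, lower_mask, upper_mask, params):
    """Sum the lower and upper branches, then apply the nonlinearity."""
    Z_d, _ = attention_branch(X, lower_mask, *params["down"])
    Z_u, _ = attention_branch(X, upper_mask, *params["up"])
    return np.tanh(Z_d + Z_u)

# Usage on four edges; the masks are hand-written adjacency patterns for illustration.
rng = np.random.default_rng(0)
N, F_in, F_out, J = 4, 3, 2, 2
lower_mask = np.array([[0, 1, 1, 0], [1, 0, 1, 1], [1, 1, 0, 1], [0, 1, 1, 0]], dtype=bool)
upper_mask = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 0], [0, 0, 0, 0]], dtype=bool)

def init_branch():
    def mats():
        return [rng.normal(size=(F_in, F_out)) for _ in range(J)]
    return mats(), mats(), mats(), rng.normal(size=2 * J * F_out)

params = {"down": init_branch(), "up": init_branch()}
X = rng.normal(size=(N, F_in))
out = san_layer(X, lower_mask, upper_mask, params)     # shape (N, F_out)
```

Keeping separate parameters and masks per branch is what lets the layer weight face-mediated and coface-mediated neighbors differently. A practical implementation would likely store the attention matrices in a sparse format rather than forming dense matrix powers.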

3. Algorithmic and Architectural Properties

SANs admit an efficient vectorized implementation due to the sparsity of simplicial adjacency, using data structures and algorithms analogous to those in sparse GNN frameworks. The cost per layer, per simplex, grows with the maximum neighborhood size.

Crucially, SANs are permutation equivariant: permuting vertex orderings induces corresponding reorderings in all $k$-simplices, signal tensors, and incidence matrices, with the update equations commuting with this action (Battiloro et al., 2023). Simplicial-awareness holds, in that the network's outputs depend on the inclusion of higher-order simplices, not just on underlying graphs, enabling full exploitation of higher-order combinatorial structures.
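Permutation equivariance can be checked numerically on a small example. The sketch below is an illustration under simplifying assumptions: it relabels the simplices directly rather than deriving the relabeling from a vertex permutation, and it uses a single-order masked attention step (`attention_filter`, a hypothetical name) with a GAT-style LeakyReLU score rather than a full SAN layer.

```python
import numpy as np

def attention_filter(X, mask, W, a):
    """Single-order masked attention: softmax over allowed neighbors, then aggregate."""
    H = X @ W
    f = H.shape[1]
    scores = (H @ a[:f])[:, None] + (H @ a[f:])[None, :]
    scores = np.where(scores > 0, scores, 0.2 * scores)       # LeakyReLU
    weights = np.where(mask, np.exp(scores - scores.max()), 0.0)
    denom = weights.sum(axis=1, keepdims=True)
    A = np.divide(weights, denom, out=np.zeros_like(weights), where=denom > 0)
    return A @ H

rng = np.random.default_rng(1)
n, f_in, f_out = 5, 3, 2
X = rng.normal(size=(n, f_in))
mask = rng.random((n, n)) < 0.5
mask = (mask | mask.T) & ~np.eye(n, dtype=bool)   # symmetric, no self-loops
W = rng.normal(size=(f_in, f_out))
a = rng.normal(size=2 * f_out)

perm = rng.permutation(n)
P = np.eye(n)[perm]                               # (P @ X)[i] == X[perm[i]]

out = attention_filter(X, mask, W, a)
out_perm = attention_filter(P @ X, mask[np.ix_(perm, perm)], W, a)

# Relabeling the inputs and then filtering matches filtering and then relabeling.
equivariant = np.allclose(out_perm, P @ out)
```

The check works because the learned parameters `W` and `a` are shared across simplices, so relabeling only reorders rows of the features and rows and columns of the mask.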

Removing the attention masks and harmonic branch reduces SAN to various previously established SNN architectures: for example, using the combinatorial Laplacians in place of the attentional ones yields simplicial convolutions [Ebli et al. 2020], and further simplification recovers message-passing simplicial nets [Bodnar et al. 2021]. Specializing to $k = 0$ recovers standard GAT (Giusti et al., 2022).
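For contrast with learned attention, the non-adaptive case can be sketched as a fixed polynomial filter in the combinatorial Hodge Laplacian. This is a simplified illustration in the spirit of simplicial convolutions, not a reproduction of any cited architecture; the function name `simplicial_conv`, the hollow-triangle example, and the inclusion of an order-0 term are assumptions.

```python
import numpy as np

def simplicial_conv(X, L, weights):
    """Fixed polynomial filter: sum over p of L^p X W_p, p = 0..len(weights)-1."""
    out = np.zeros((X.shape[0], weights[0].shape[1]))
    LpX = X.copy()                 # holds L^p X, starting at p = 0
    for W in weights:
        out += LpX @ W
        LpX = L @ LpX
    return out

# Hollow triangle with edges (0,1), (0,2), (1,2) and no filled 2-simplex,
# so the order-1 Hodge Laplacian reduces to the lower term B1^T B1.
B1 = np.array([[-1.0, -1.0,  0.0],
               [ 1.0,  0.0, -1.0],
               [ 0.0,  1.0,  1.0]])
L1 = B1.T @ B1

rng = np.random.default_rng(0)
weights = [rng.normal(size=(1, 2)) for _ in range(3)]   # W_0, W_1, W_2

circulation = np.array([[1.0], [-1.0], [1.0]])          # edge flow around the cycle
out = simplicial_conv(circulation, L1, weights)
```

Because the circulating flow lies in the kernel of `L1`, every term with p >= 1 vanishes and only the order-0 weight contributes. The aggregation weights here are fixed by the topology alone, whereas in SAN they are learned per neighbor pair.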

4. Relation to Other Architectures

SANs systematically generalize attention-based learning on graphs by operating on arbitrary simplex orders and learning separate attentional weights over upper and lower adjacency relations (Giusti et al., 2022, Giusti, 2024). In contrast, models such as GAT are restricted to masked attention over edge neighborhoods (1-simplices), and classical SNNs/Simplicial ConvNets apply non-adaptive, purely combinatorial aggregations.

SAT (Goh et al., 2022) differs from SAN by employing a single attention over all neighbors and omitting harmonic projections. SGAT (Lee et al., 2022) extends the SAN paradigm to heterogeneous graphs by constructing simplicial complexes in which higher-order simplices aggregate heterogeneous node and edge types, enabling attention-based aggregation across target-type cliques and their shared non-target neighbors via upper-adjacency.

Generalized Simplicial Attention Networks (GSAN) (Battiloro et al., 2023) further extend SAN to process data residing on both simplicial and cell complexes, impose Hodge-theoretic principles explicitly, and demonstrate provable permutation equivariance and order-$k$ simplicial awareness.

5. Empirical Evaluation and Applications

SANs and their variants have been evaluated on a variety of tasks involving higher-order data:

  • Trajectory Prediction: On synthetic planar flow and real-world ocean drifter datasets, SAN outperforms MPSN, SCNN, and SAT baselines in accuracy (Giusti et al., 2022, Battiloro et al., 2023).
  • Missing Data Imputation: On citation complexes with up to fifth-order simplices, SANs improve accuracy over SCNN, SNN, and SAT, showing greater robustness at high masking rates and high simplex order (Giusti et al., 2022, Battiloro et al., 2023).
  • Node and Graph Classification: SGAT (Lee et al., 2022) demonstrates state-of-the-art performance on heterogeneous node classification tasks (DBLP, ACM, IMDB), outperforming GAT, HAN, and metapath-based methods in Macro-F1 and Micro-F1 on DBLP. When node features are replaced by random noise, SGAT still attains a high Micro-F1 score, underscoring the structural expressivity of simplicial attention.
  • Simplex Prediction: GSAN achieves the top ROC-AUC on filled versus open simplex prediction in citation complexes (Battiloro et al., 2023).

6. Implementation, Scalability, and Practical Considerations

SAN layers operate locally on sparse neighborhoods, supporting scalability to complexes with millions of simplices, provided neighborhood sizes are bounded. Batch and subgraph sampling techniques from GNNs can be applied to mitigate memory bottlenecks. Hyperparameters such as $J$ (receptive field depth), $H$ (number of heads), and $F$ (hidden channels) directly trade off expressive power and complexity.

The inclusion of the harmonic branch is beneficial when the application involves global cycles or homological features, but computation of sparse harmonic projectors can be costly if the Laplacian kernel is large; approximate methods (e.g., Chebyshev filtering) are applicable (Giusti et al., 2022). Extensions to time-varying complexes, positional encodings, and hierarchical pooling are plausible avenues, as suggested in the literature.
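To illustrate the cost trade-off, the sketch below compares an exact harmonic projector, obtained from a dense eigendecomposition, with a cheap approximation. Note the substitution: the approximation shown is a plain matrix power of `I - eps * L`, not Chebyshev filtering, chosen only because it is short to write. The function names, the step count, and the `0.9` safety factor are assumptions.

```python
import numpy as np

def harmonic_projector(L, tol=1e-9):
    """Exact orthogonal projector onto ker(L) via a dense eigendecomposition."""
    eigvals, eigvecs = np.linalg.eigh(L)
    U = eigvecs[:, np.abs(eigvals) < tol]
    return U @ U.T

def approx_harmonic_projector(L, steps=50):
    """Approximate projector (I - eps * L)^steps with eps below 1 / lambda_max."""
    eps = 0.9 / np.linalg.eigvalsh(L)[-1]
    M = np.eye(L.shape[0]) - eps * L
    return np.linalg.matrix_power(M, steps)

# Hollow triangle: three edges and no filled 2-simplex, hence one 1-dimensional hole.
B1 = np.array([[-1.0, -1.0,  0.0],
               [ 1.0,  0.0, -1.0],
               [ 0.0,  1.0,  1.0]])
L1 = B1.T @ B1

P_exact = harmonic_projector(L1)
P_approx = approx_harmonic_projector(L1)
gap = np.max(np.abs(P_exact - P_approx))   # shrinks as steps grows
```

The power iteration leaves kernel components untouched and damps every other eigencomponent by a factor below one per step, so it converges to the projector without an eigendecomposition. In practice the largest eigenvalue would be bounded or estimated rather than computed exactly as it is here.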

7. Limitations and Extensions

The primary strength of SANs lies in their principled, learnable, and topologically intrinsic approach to multi-way relational modeling. However, direct global attention is not natively supported in the base SAN architecture, and harmonic projection computations may become prohibitive for highly connected or large-scale complexes. Applying SANs to dynamic or evolving simplicial complexes, incorporating cell complexes beyond simplicial structures, and extending attention beyond local (upper/lower) neighborhoods remain active areas of research (Battiloro et al., 2023, Giusti, 2024).

In summary, Simplicial Attention Networks advance the expressive capacity of neural message-passing architectures to higher-order topological domains by introducing adaptive, interpretable attention mechanisms that operate coherently within the combinatorial and algebraic framework of simplicial complexes. This innovation enables effective learning and inference on data where interactions intrinsically transcend pairwise relations.
