Sheaf Hypergraph Networks

Updated 8 May 2026

Sheaf Hypergraph Networks are advanced neural architectures that employ cellular sheaf theory and hypergraph structures to model complex, multi-directional relationships.
They integrate trainable restriction maps and a sheaf hypergraph Laplacian to effectively capture heterophilic, anisotropic, and multi-way interactions in data.
Empirical evaluations report state-of-the-art performance on node classification benchmarks, and the framework generalizes traditional graph and hypergraph operators.

Sheaf Hypergraph Networks are a class of neural architectures that integrate the mathematical formalism of cellular sheaves with hypergraph and simplicial set structures to enable expressive processing of higher-order and possibly directed relational data. By enriching classical hypergraph Laplacians with trainable, per-incidence restriction maps and exploiting the Dirichlet energy induced by sheaf theory, these methods provide an expressive inductive bias tailored to heterophilic, anisotropic, and multi-way interactions encountered in real-world structured data. They subsume standard hypergraph, graph, and simplicial neural operators as limiting cases and have demonstrated state-of-the-art performance on node classification and related tasks across a variety of settings.

1. Mathematical Foundations

A hypergraph $\mathcal{H}=(V,E)$ consists of a finite set of vertices $V$ and a collection of hyperedges $E\subseteq 2^V\setminus\{\emptyset\}$. In the directed setting, each hyperedge $e\in E$ is partitioned into a (possibly empty) tail $T(e)$ and head $H(e)$, with $T(e)\,\dot\cup\,H(e)=e$.

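A cellular sheaf $\mathcal{F}$ on $\mathcal{H}$ assigns a finite-dimensional (typically $d$-dimensional) complex vector space to each vertex and hyperedge, $\mathcal{F}(v)\cong\mathbb{C}^d$ and $\mathcal{F}(e)\cong\mathbb{C}^d$, with linear restriction maps $\mathcal{F}_{v\trianglelefteq e}:\mathcal{F}(v)\to\mathcal{F}(e)$ for each incidence $v\in e$. For directed hyperedges, a direction-dependent complex phase $s_{v,e}$ with $|s_{v,e}|=1$, determined by whether $v\in T(e)$ or $v\in H(e)$, modifies each restriction: $\mathcal{S}_{v\trianglelefteq e} = s_{v,e}\,\mathcal{F}_{v\trianglelefteq e}$. The sheaf restriction data generalizes the role of weights in classical message passing, enabling expressive, learning-based control of information flow over complex relational domains (Mule et al., 6 Oct 2025).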
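The phase-modulated restriction maps can be sketched in a few lines of NumPy. This is a minimal illustration, not the cited authors' implementation: the function names, the random stand-ins for learned maps, and in particular the phase convention (tail vertices rotated by $e^{-2\pi i q}$, head vertices left unchanged, with $q$ a charge parameter) are assumptions made for the example.

```python
import numpy as np

def phase(v, tail, head, q):
    """Unit-modulus phase for the incidence (v, e).

    Assumed convention (illustrative only): tail vertices are rotated by
    exp(-2*pi*i*q), head vertices are left unchanged.
    """
    if v in tail:
        return np.exp(-2j * np.pi * q)
    if v in head:
        return 1.0 + 0.0j
    raise ValueError(f"vertex {v} is not incident to this hyperedge")

def directed_restrictions(hyperedges, d, q, rng):
    """Return {(v, e): d x d complex block} for every incidence v in e."""
    maps = {}
    for e, (tail, head) in enumerate(hyperedges):
        for v in sorted(tail | head):
            F = rng.standard_normal((d, d))  # stand-in for a learned restriction map
            maps[(v, e)] = phase(v, tail, head, q) * F
    return maps

# Two directed hyperedges given as (tail, head): {0, 1} -> {2} and {2} -> {3}.
hyperedges = [({0, 1}, {2}), ({2}, {3})]
maps_q0 = directed_restrictions(hyperedges, d=2, q=0.0, rng=np.random.default_rng(0))
maps_q = directed_restrictions(hyperedges, d=2, q=0.25, rng=np.random.default_rng(0))
```

With $q=0$ every block stays real, so direction is ignored; with $q=0.25$ each tail block is multiplied by $-i$ while head blocks are untouched.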

2. Sheaf Hypergraph Laplacians and Generalizations

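The core operator motivating sheaf hypergraph networks is the (normalized) sheaf hypergraph Laplacian. For both undirected and directed cases, a block-incidence matrix $B$ is formed from the (phase-modulated) restriction maps, with $(e,v)$ block $\mathcal{S}_{v\trianglelefteq e}$ when $v\in e$ and zero otherwise, and block-diagonal degree matrices $D_V$ and $D_E$ are assembled using node and hyperedge degrees. The normalized Laplacian reads: $L = I - D_V^{-1/2}\,B^\dagger D_E^{-1} B\,D_V^{-1/2}$. Entry-wise, the $(u,v)$ block $L_{uv}$ satisfies: $L_{uv} = \delta_{uv} I_d - D_u^{-1/2}\Big(\sum_{e\ni u,v}\mathcal{S}_{u\trianglelefteq e}^\dagger D_e^{-1}\,\mathcal{S}_{v\trianglelefteq e}\Big) D_v^{-1/2}$, where $D_u$ and $D_e$ denote the diagonal blocks of $D_V$ and $D_E$ and $\delta_{uv}$ is the Kronecker delta. Specializing parameters and stalk structure recovers classical, magnetic, and previously proposed directed and undirected Laplacians in both the graph and hypergraph setting (Mule et al., 6 Oct 2025, Duta et al., 2023).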

The Laplacian $L$ is Hermitian, positive semidefinite with spectrum in $[0,1]$, and its quadratic form gives a sheaf-theoretic Dirichlet energy: $\mathcal{E}(x) = x^\dagger L\,x$, where $x$ is the stacked node signal, capturing global disagreement across all hyperedges.
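The spectral properties above can be checked numerically on a toy example. The sketch below assumes the normalized form $L = I - D_V^{-1/2} B^\dagger D_E^{-1} B D_V^{-1/2}$ with scalar degrees (number of incident hyperedges per node, number of nodes per hyperedge), unitary restriction blocks, and an illustrative tail-phase convention; all function names are hypothetical, and the cited papers may use a different normalization.

```python
import numpy as np

def random_unitary_maps(hyperedges, d, q, rng):
    """Phase-modulated orthogonal blocks, one per incidence (v, e)."""
    maps = {}
    for e, (tail, head) in enumerate(hyperedges):
        for v in sorted(tail | head):
            Q, _ = np.linalg.qr(rng.standard_normal((d, d)))   # orthogonal block
            s = np.exp(-2j * np.pi * q) if v in tail else 1.0   # assumed phase rule
            maps[(v, e)] = s * Q
    return maps

def sheaf_hypergraph_laplacian(n, hyperedges, maps, d):
    """Assemble L = I - Dv^{-1/2} B^H De^{-1} B Dv^{-1/2} (assumed form)."""
    m = len(hyperedges)
    B = np.zeros((m * d, n * d), dtype=complex)
    deg_v, deg_e = np.zeros(n), np.zeros(m)
    for e, (tail, head) in enumerate(hyperedges):
        members = sorted(tail | head)
        deg_e[e] = len(members)
        for v in members:
            B[e * d:(e + 1) * d, v * d:(v + 1) * d] = maps[(v, e)]
            deg_v[v] += 1
    Dv_inv_sqrt = np.kron(np.diag(deg_v ** -0.5), np.eye(d))
    De_inv = np.kron(np.diag(1.0 / deg_e), np.eye(d))
    return np.eye(n * d) - Dv_inv_sqrt @ B.conj().T @ De_inv @ B @ Dv_inv_sqrt

rng = np.random.default_rng(0)
hyperedges = [({0, 1}, {2}), ({2}, {3}), ({3, 0}, {1})]  # (tail, head) pairs
n, d = 4, 2
maps = random_unitary_maps(hyperedges, d, 0.25, rng)
L = sheaf_hypergraph_laplacian(n, hyperedges, maps, d)

eigs = np.linalg.eigvalsh(L)                      # real eigenvalues of a Hermitian matrix
x = rng.standard_normal(n * d) + 1j * rng.standard_normal(n * d)
energy = np.real(x.conj() @ L @ x)                # Dirichlet energy x^H L x
```

Under these assumptions `L` is Hermitian, `eigs` lies in $[0,1]$ up to round-off, and `energy` is non-negative. The bound on the spectrum relies on the restriction blocks having operator norm at most one, which is why the sketch uses unitary blocks.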

3. Network Architecture and Learning

One layer of a Sheaf Hypergraph Network performs spectral diffusion using the (learned) Laplacian: $X^{(t+1)} = \sigma\big((I_{nd} - L)\,(I_n\otimes W_1)\,X^{(t)}\,W_2\big)$, where $X^{(t)}\in\mathbb{C}^{nd\times f}$ collects node features for $n=|V|$ nodes, $W_1$ and $W_2$ are learnable weights, and $\sigma$ is a nonlinearity. For real-valued output, the "unwind" operation stacks real and imaginary parts.
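A single diffusion layer can be sketched as follows, assuming the update $\sigma\big((I - L)(I_n\otimes W_1) X W_2\big)$ with $W_1$ acting on stalks and $W_2$ on feature channels. Applying a ReLU after unwinding, the real-valued weights, and the names `shn_layer` and `unwind` are illustrative choices rather than details confirmed by the source.

```python
import numpy as np

def unwind(Z):
    """Stack real and imaginary parts along the feature axis."""
    return np.concatenate([Z.real, Z.imag], axis=-1)

def shn_layer(L, X, W1, W2, d):
    """One diffusion step: ReLU(unwind((I - L) (I_n kron W1) X W2)).

    L  : (n*d, n*d) Hermitian sheaf Laplacian
    X  : (n*d, f_in) complex node features, d stalk rows per node
    W1 : (d, d) stalk-mixing weights; W2 : (f_in, f_out) channel-mixing weights
    """
    n = X.shape[0] // d
    mixed = np.kron(np.eye(n), W1) @ X @ W2          # mix within stalks and channels
    diffused = (np.eye(L.shape[0]) - L) @ mixed      # diffusion with I - L
    return np.maximum(unwind(diffused), 0.0)         # real-valued, non-negative output

rng = np.random.default_rng(0)
n, d, f_in, f_out = 4, 2, 3, 5
X = rng.standard_normal((n * d, f_in)) + 1j * rng.standard_normal((n * d, f_in))
W1 = rng.standard_normal((d, d))
W2 = rng.standard_normal((f_in, f_out))

out_identity = shn_layer(np.eye(n * d), X, W1, W2, d)        # L = I removes everything
out_zero = shn_layer(np.zeros((n * d, n * d)), X, W1, W2, d)  # L = 0 leaves features undiffused
```

The two extreme cases make the role of $I - L$ visible: with $L = I$ the output vanishes, and with $L = 0$ the layer reduces to a pointwise linear map followed by the nonlinearity. The output has `2 * f_out` real channels because of the unwind step.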

Restriction maps $\mathcal{F}_{v\trianglelefteq e}$ are parameterized by MLPs applied to concatenations of node and edge features. The charge parameter $q$ in the phase matrix tunes the trade-off between directionality and undirected symmetry.
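The MLP parameterization can be illustrated with a small NumPy sketch. The two-layer architecture, the `tanh` activation, the hidden width, and the use of the mean of member-node features as the hyperedge feature are all assumptions for the example; the source only states that an MLP acts on concatenated node and edge features.

```python
import numpy as np

def init_mlp(in_dim, hidden, out_dim, rng):
    """Parameters of a hypothetical two-layer MLP."""
    return {
        "W1": rng.standard_normal((in_dim, hidden)) / np.sqrt(in_dim),
        "b1": np.zeros(hidden),
        "W2": rng.standard_normal((hidden, out_dim)) / np.sqrt(hidden),
        "b2": np.zeros(out_dim),
    }

def restriction_map(params, x_v, h_e, d):
    """Map concat(node feature, hyperedge feature) to a d x d block."""
    z = np.concatenate([x_v, h_e])
    hidden = np.tanh(z @ params["W1"] + params["b1"])
    return (hidden @ params["W2"] + params["b2"]).reshape(d, d)

rng = np.random.default_rng(0)
f, d = 3, 2
X = rng.standard_normal((4, f))            # one feature vector per node
hyperedges = [[0, 1, 2], [2, 3]]
params = init_mlp(2 * f, 8, d * d, rng)

maps = {}
for e, members in enumerate(hyperedges):
    h_e = X[members].mean(axis=0)          # assumed hyperedge feature: mean of members
    for v in members:
        maps[(v, e)] = restriction_map(params, X[v], h_e, d)
```

Because the same MLP is shared across all incidences, the number of parameters does not grow with the size of the hypergraph; node 2 receives a different block in each of its two hyperedges since the edge feature differs.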

The DSHNLight variant detaches the Laplacian-assembly computation (freezes MLPs after initialization), allowing efficient training with comparable accuracy at significantly reduced computational cost (Mule et al., 6 Oct 2025).

Alternatively, a symmetric simplicial set construction resolves higher arity and orientation canonically. The resulting Laplacian and sheaf maps are used for neural diffusion in architectures such as Hypergraph Neural Sheaf Diffusion (HNSD), supporting full end-to-end learning and state-of-the-art benchmark performance (Choi et al., 9 May 2025).

4. Extensions: High-Dimensional, Persistent, and Algebraic Sheaf Structures

Sheaf hypergraph networks admit categorical (simplicial set) and algebraic (module, ringed space) generalizations. Lifting a hypergraph $\mathcal{H}$ to a symmetric simplicial set $X$, a cellular sheaf $\mathcal{F}$ is a functor assigning to each simplex a vector space, with restriction maps for each face inclusion (Choi et al., 2024, Choi et al., 9 May 2025).

The persistent local homology sheaf captures multi-scale topological features by incorporating filtrations over simplex weights and utilizing persistence intervals as canonical bases, enabling the entire network topology (including restriction maps and Laplacians) to become differentiable via backpropagation (Cesa et al., 2023).

For higher-dimensional simplicial complexes, cochain spaces $C^k(X;\mathcal{F})$ and sheaf coboundaries $\delta^k: C^k(X;\mathcal{F})\to C^{k+1}(X;\mathcal{F})$ generalize message-passing to $(k+1)$-way relationships. The $k$-th sheaf Laplacian $L_k = (\delta^k)^{*}\delta^k + \delta^{k-1}(\delta^{k-1})^{*}$ encodes both "up" (coface) and "down" (face) transmission, allowing the architecture to synthesize geometric and combinatorial influences (Hu, 29 May 2025).
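As a small worked example of the up and down terms, the sketch below builds the first Hodge-type sheaf Laplacian $L_1 = (\delta^1)^{*}\delta^1 + \delta^0(\delta^0)^{*}$ on a single filled triangle. It assumes the constant sheaf with $d$-dimensional stalks and identity restriction maps, so each coboundary is the usual signed incidence matrix tensored with $I_d$; the function name is hypothetical.

```python
import numpy as np

def sheaf_hodge_laplacian(delta_prev, delta_k):
    """L_k = delta_k^H delta_k (up, via cofaces) + delta_{k-1} delta_{k-1}^H (down, via faces)."""
    down = delta_prev @ delta_prev.conj().T
    up = delta_k.conj().T @ delta_k
    return down + up

d = 2  # stalk dimension of the constant sheaf (assumption for the example)

# Filled triangle on vertices 0, 1, 2 with edges ordered (0,1), (0,2), (1,2).
# delta0 maps vertex cochains to edge cochains: row for edge (i, j) is -e_i + e_j.
delta0 = np.array([[-1, 1, 0],
                   [-1, 0, 1],
                   [0, -1, 1]], dtype=float)
# delta1 maps edge cochains to the triangle: (1,2) - (0,2) + (0,1).
delta1 = np.array([[1, -1, 1]], dtype=float)

# Constant sheaf: tensor each coboundary with the identity on the stalk.
delta0_F = np.kron(delta0, np.eye(d))
delta1_F = np.kron(delta1, np.eye(d))

L1 = sheaf_hodge_laplacian(delta0_F, delta1_F)
```

Here `delta1_F @ delta0_F` vanishes, as required of a cochain complex, and `L1` equals $3I$: the filled triangle has no one-dimensional hole, so the edge Laplacian has a trivial kernel. Replacing the identity blocks with learned restriction maps gives a non-constant sheaf, subject to the same composition-to-zero constraint.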

5. Empirical Performance and Theoretical Analysis

Sheaf Hypergraph Networks are reported to achieve top or near-top accuracy on standard node classification and link prediction benchmarks across real-world citation, co-authorship, communication, and synthetic datasets. In the directed case, principled modeling of asymmetric relations via DSHN yields relative improvements over the best baselines on strongly heterophilic and directed benchmarks (e.g., Telegram, Email-Enron) (Mule et al., 6 Oct 2025).

Depth ablations show that, unlike standard HGNNs, where oversmoothing degrades performance beyond 2–3 layers, SHN variants remain stable or even improve as depth increases, a result attributed to the preservation of sheaf Dirichlet energy in edge stalks (Duta et al., 2023). Increasing the stalk dimension $d$ augments representational power for heterophilic data, with diminishing returns beyond moderate $d$ (Mule et al., 6 Oct 2025). The DSHNLight approach reduces FLOPs per epoch without notable accuracy loss.

Phase parameter ablations (varying $q$) reveal that model performance tracks the alignment of hyperedge directions with class structure, and the optimal setting indicates whether directionality is informative or merely noise (Mule et al., 6 Oct 2025). For persistent and homological sheaf networks, the differentiable encoding of multi-scale local topology yields robust performance on data with complex underlying geometry (Cesa et al., 2023).

6. Relation to Graph, Simplicial, and Other Higher-Order Architectures

Sheaf Hypergraph Networks strictly generalize existing graph, hypergraph, and simplicial convolution operators:

For $q=0$, $d=1$, and trivial restriction maps, the Laplacian reduces to the classical hypergraph Laplacian of Zhou et al.; for uniform hyperedge size $|e|=2$, to the standard graph Laplacian (Mule et al., 6 Oct 2025, Duta et al., 2023).
The magnetic Laplacian and sign-magnetic rule are recovered for particular $q$ and sheaf choices, unifying spectral methods in both undirected and directed regimes (Mule et al., 6 Oct 2025).
The categorical and persistent sheaf approaches are compatible with standard constructions in topological data analysis and network modeling, allowing integration with attention, message-passing, and diffusion architectures (Cesa et al., 2023, Hu, 29 May 2025).
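The graph-level reductions in the list above can be checked numerically on a small directed graph viewed as a 2-uniform hypergraph with scalar stalks ($d=1$). The sketch assumes the normalized form $I - D_V^{-1/2} B^\dagger D_E^{-1} B D_V^{-1/2}$ and an illustrative phase convention (tail entries $e^{-2\pi i q}$, head entries $1$); under these assumptions the operator equals one half of the normalized graph Laplacian at $q=0$ and one half of the normalized magnetic Laplacian for general $q$. The constant factor comes from the hyperedge degree $|e|=2$ in this particular normalization.

```python
import numpy as np

def normalized_laplacian_from_incidence(B):
    """I - Dv^{-1/2} B^H De^{-1} B Dv^{-1/2} for a scalar incidence B of shape |E| x |V|."""
    support = (B != 0)
    deg_v = support.sum(axis=0).astype(float)   # hyperedges per node
    deg_e = support.sum(axis=1).astype(float)   # nodes per hyperedge
    Dv_is = np.diag(deg_v ** -0.5)
    return np.eye(B.shape[1]) - Dv_is @ B.conj().T @ np.diag(1.0 / deg_e) @ B @ Dv_is

def incidence(edges, n, q):
    """Directed edges u -> v as 2-uniform hyperedges with an assumed tail phase."""
    B = np.zeros((len(edges), n), dtype=complex)
    for e, (u, v) in enumerate(edges):
        B[e, u] = np.exp(-2j * np.pi * q)  # tail
        B[e, v] = 1.0                       # head
    return B

edges = [(0, 1), (1, 2), (2, 3), (0, 2)]    # no antiparallel pairs
n, q = 4, 0.25

A = np.zeros((n, n))
for u, v in edges:
    A[u, v] = 1.0
A_sym = A + A.T
D_is = np.diag(A_sym.sum(axis=1) ** -0.5)

L_graph = np.eye(n) - D_is @ A_sym @ D_is                 # normalized graph Laplacian
H_mag = A_sym * np.exp(2j * np.pi * q * (A - A.T))        # magnetic adjacency
L_mag = np.eye(n) - D_is @ H_mag @ D_is                   # normalized magnetic Laplacian

L_q0 = normalized_laplacian_from_incidence(incidence(edges, n, 0.0))
L_q = normalized_laplacian_from_incidence(incidence(edges, n, q))
```

In this example `L_q0` matches `0.5 * L_graph` and `L_q` matches `0.5 * L_mag`, so a single incidence-based assembly covers both the undirected and the magnetic case.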

These frameworks maintain compatibility with existing Laplacian-based geometric learning pipelines and offer a mathematically principled route to extend GNNs to settings with higher arity, non-symmetric, and topologically intricate relations.

7. Implementation Considerations and Practical Guidance

A canonical implementation requires:

Building the (possibly directed) incidence matrix, assembling block restriction maps (possibly via MLPs), and forming degree and Laplacian operators.
Following the block formulas in (Choi et al., 2024) for the basic cellular sheaf Laplacian in undirected cases, and the constructions in (Mule et al., 6 Oct 2025, Choi et al., 9 May 2025) for directed and higher-order cases.
Efficient differentiation through Laplacian eigenstructure (when needed for learning the sheaf parameters), or alternatively, through matrix-vector products for large-scale and dynamic problems.
For persistent and high-dimensional settings, maintaining efficient representations of the Vietoris–Rips complex or symmetric simplicial set lifts, with associated filtration and persistent homology computations (Hu, 29 May 2025, Cesa et al., 2023).
Scaling to large hypergraphs can exploit detached Laplacian-assembly (DSHNLight), block-sparse representations, or parallelization over hyperedges/nodes.
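For the matrix-vector route mentioned in the list above, the Laplacian never needs to be materialized: it can be applied through a gather over incidences, a scatter to hyperedges, and a scatter back to nodes. The sketch below assumes the normalized form $I - D_V^{-1/2} B^\dagger D_E^{-1} B D_V^{-1/2}$ with scalar degrees and stores one $d\times d$ block per incidence; the array layout and function names are hypothetical, and the dense assembly is included only to verify the matrix-free version.

```python
import numpy as np

def laplacian_matvec(x, inc_v, inc_e, blocks, n, m):
    """Apply L to x of shape (n, d) without forming L.

    inc_v, inc_e : node and hyperedge index of each incidence, shape (nnz,)
    blocks       : restriction block of each incidence, shape (nnz, d, d)
    """
    deg_v = np.bincount(inc_v, minlength=n).astype(float)
    deg_e = np.bincount(inc_e, minlength=m).astype(float)
    y = x / np.sqrt(deg_v)[:, None]                           # Dv^{-1/2} x
    msgs = np.einsum("kij,kj->ki", blocks, y[inc_v])          # S_{v,e} y_v per incidence
    z = np.zeros((m, x.shape[1]), dtype=complex)
    np.add.at(z, inc_e, msgs)                                 # sum over nodes of each hyperedge
    z /= deg_e[:, None]                                       # De^{-1}
    back = np.einsum("kji,kj->ki", blocks.conj(), z[inc_e])   # S_{v,e}^H z_e per incidence
    w = np.zeros((n, x.shape[1]), dtype=complex)
    np.add.at(w, inc_v, back)                                 # sum over hyperedges of each node
    return x - w / np.sqrt(deg_v)[:, None]

def dense_laplacian(inc_v, inc_e, blocks, n, m):
    """Dense reference assembly, used here only for verification."""
    d = blocks.shape[1]
    B = np.zeros((m * d, n * d), dtype=complex)
    for v, e, S in zip(inc_v, inc_e, blocks):
        B[e * d:(e + 1) * d, v * d:(v + 1) * d] = S
    dv = np.repeat(np.bincount(inc_v, minlength=n), d).astype(float)
    de = np.repeat(np.bincount(inc_e, minlength=m), d).astype(float)
    M = (B.conj().T / de) @ B
    return np.eye(n * d) - M / np.sqrt(dv)[:, None] / np.sqrt(dv)[None, :]

rng = np.random.default_rng(0)
n, m, d = 4, 2, 2
inc_v = np.array([0, 1, 2, 2, 3])   # hyperedge 0 = {0, 1, 2}, hyperedge 1 = {2, 3}
inc_e = np.array([0, 0, 0, 1, 1])
blocks = rng.standard_normal((5, d, d)) + 1j * rng.standard_normal((5, d, d))
x = rng.standard_normal((n, d)) + 1j * rng.standard_normal((n, d))

fast = laplacian_matvec(x, inc_v, inc_e, blocks, n, m)
slow = dense_laplacian(inc_v, inc_e, blocks, n, m) @ x.reshape(-1)
```

The matrix-free version touches each incidence twice, so its cost scales with the number of incidences times $d^2$ rather than with $(nd)^2$, and the loop over incidences is a natural unit for parallelization or block-sparse storage.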

Sheaf Hypergraph Networks unify, extend, and empirically outperform numerous existing models for higher-order relational learning, offering expressive architectures in both algebraic and geometric regimes (Mule et al., 6 Oct 2025, Duta et al., 2023, Choi et al., 2024, Choi et al., 9 May 2025, Hu, 29 May 2025).