
Topological Neural Networks Overview

Updated 8 February 2026
  • Topological Neural Networks (TNNs) are neural models that incorporate topological concepts to optimize network structure and connectivity.
  • They employ differentiable edge weight learning and higher-dimensional message-passing to enhance feature aggregation and mitigate over-squashing.
  • TNNs improve expressivity and performance through persistent homology augmentation and domain-specific structural adaptations in various applications.

A Topological Neural Network (TNN) is a neural architecture or module whose design or learning dynamics are fundamentally structured, regularized, or formulated in terms of topological concepts. This includes principled treatment of higher-order relations (simplicial, cellular, or combinatorial complexes), explicit optimization or learning of connectivity (topology) within the neural network architecture, augmentation or integration of topological features (e.g., persistent homology), or the imposition of topological priors and constraints for domain-specific expressivity or regularization. The TNN paradigm has advanced both in architectural methodology, such as differentiable topology learning in convolutional backbones, and in message-passing, symmetry-aware extensions to non-Euclidean or geometric domains. The following presents a detailed technical overview.

1. Topological Optimization of Neural Network Connectivity

In the architectural variant, the TNN seeks to automatically discover optimal skip-connection patterns within deep convolutional or residual architectures by learning edge connectivity in a complete acyclic stage-graph. Each stage of the network is modeled as a complete directed acyclic graph $G = (V, E)$, with $N$ ordered computation nodes and directed edges $(i \to j)$. Each edge is parameterized by a learnable scalar $\alpha_{ij} \ge 0$ which modulates the contribution of node $i$ to node $j$ in the transformed feature space. Formally, the node update at $v_j$ takes the form:

$$x_j' = \sum_{i : (i \to j) \in E} \alpha_{ij}\, f(x_i; W_i), \qquad x_j = \sigma(x_j'),$$

where $f(\cdot; W_i)$ is a feature transform (typically convolution-batchnorm-ReLU) and $\sigma$ is the nonlinearity.

The total learning objective is

$$L_{\text{total}} = L_{\text{task}} + \lambda \|\alpha\|_1,$$

with sparsity controlled by $\lambda$. All edge weights $\alpha_{ij}$ are learned jointly with the network weights in a fully differentiable pipeline using standard backpropagation, obviating the need for discrete architecture search or bilevel optimization.
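To make the update and objective concrete, here is a minimal NumPy sketch of one stage's forward pass and the L1-regularized loss. The stage size, feature width, and $\lambda$ are illustrative, and $f(\cdot; W_i)$ is simplified to a plain linear map rather than a convolution block:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stage with N = 4 ordered nodes and feature width d = 8.
# alpha[i, j] weights the edge i -> j (only i < j is used, giving a
# complete DAG); all sizes here are illustrative.
N, d = 4, 8
alpha = np.ones((N, N))                   # initialized to 1.0, as in the text
W = rng.standard_normal((N, d, d)) * 0.1  # per-node linear stand-ins for f(.; W_i)

def relu(x):
    return np.maximum(x, 0.0)

def stage_forward(x0):
    """x_j = sigma(sum_{i < j} alpha_ij * f(x_i; W_i)) for each node j."""
    x = [x0]
    for j in range(1, N):
        agg = sum(alpha[i, j] * (x[i] @ W[i]) for i in range(j))
        x.append(relu(agg))
    return x[-1]

def total_loss(task_loss, lam=1e-3):
    # L_total = L_task + lambda * ||alpha||_1; the L1 term drives
    # unused edges toward zero, yielding a sparse learned topology.
    return task_loss + lam * np.abs(alpha).sum()

out = stage_forward(rng.standard_normal(d))   # out has shape (8,)
```

In the actual architecture $f$ would be a convolution-batchnorm-ReLU block, and the $\alpha_{ij}$ would be updated by the same SGD step as the weights.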

Integration: Existing architectures (ResNet, MobileNet, VGG, etc.) can be "wrapped" by swapping preexisting skip-connection schemes with TNN-optimized blocks.

Performance: Example improvements include ResNet-110 on CIFAR-100 (Top-1: 76.31% → 78.54%) and MobileNetV2 on ImageNet (Top-1: 77.61% → 78.36%). In object detection (COCO + FPN), AP for a ResNet-50 backbone increased from 36.42 to 41.69. The learned topologies are highly sparse: pruning up to 80% of edges post-training often incurs negligible loss when retraining is applied (Yuan et al., 2020).

2. Topological Message-Passing Beyond Graphs

TNNs fundamentally generalize message-passing protocols from ordinary graphs to higher-order topological structures:

  • Simplicial Complexes: $k$-simplices encode multi-way relations among $k+1$ set members, with structure preserved under taking faces.
  • Cell Complexes: Permit arbitrary-shaped cells (polygons, polyhedra) and richer boundary relations.
  • Combinatorial Complexes: Admit mixed-rank, heterogeneous, and hierarchical cells.

The canonical TNN message-passing layer updates cell features $h_o^{(\ell)}$ via

$$h_o^{(\ell+1)} = \text{COM}_k\Bigl[\, h_o^{(\ell)},\ \text{AGG}_-^k\bigl(\{M_-^k(h_o^{(\ell)}, h_\tau^{(\ell)}) \mid \tau \in N_-(o)\}\bigr),\ \text{AGG}_+^k\bigl(\{M_+^k(h_o^{(\ell)}, h_\tau^{(\ell)}) \mid \tau \in N_+(o)\}\bigr) \Bigr]$$

where $N_-(o)$ (lower) and $N_+(o)$ (upper) denote boundary and co-boundary neighborhoods, and $\text{COM}$, $\text{AGG}$, and $M$ are (per-dimension) parameterized functions, often MLPs or attention.
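A minimal executable instance of this layer, with neighborhoods hard-coded for one filled triangle and deliberately simple choices of $M$, $\text{AGG}$, and $\text{COM}$ (none of which are prescribed by the general template):

```python
import numpy as np

# One message-passing step over the 1-cells (edges) of a single filled
# triangle {0, 1, 2}. Neighborhoods are hard-coded, and M, AGG, COM are
# illustrative choices (difference messages, sum aggregation, ReLU combine).
edges = [(0, 1), (1, 2), (0, 2)]
h = {e: np.array([i + 1.0, 0.0]) for i, e in enumerate(edges)}  # edge features

# Lower neighbors share a boundary vertex; upper neighbors share the 2-cell.
# For one filled triangle the two neighborhood structures coincide.
N_lower = {(0, 1): [(1, 2), (0, 2)],
           (1, 2): [(0, 1), (0, 2)],
           (0, 2): [(0, 1), (1, 2)]}
N_upper = N_lower

def M(h_o, h_t):                 # message function
    return h_t - h_o

def AGG(msgs):                   # permutation-invariant aggregation
    return np.sum(msgs, axis=0)

def COM(h_o, m_low, m_up):       # combination with a ReLU nonlinearity
    return np.maximum(h_o + m_low + m_up, 0.0)

def mp_step(h):
    return {o: COM(h[o],
                   AGG([M(h[o], h[t]) for t in N_lower[o]]),
                   AGG([M(h[o], h[t]) for t in N_upper[o]]))
            for o in h}

h_next = mp_step(h)
```

In practice $M$, $\text{AGG}$, and $\text{COM}$ are learned (MLPs or attention), and the update runs in parallel across all ranks of the complex.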

Topological attention networks (e.g., Simplicial Attention Networks [SAN], Cell Attention Networks [CAN]) further weight these messages using softmax-leakyReLU attention over both boundary and co-boundary neighbors, enabling anisotropic aggregation and multi-relational inductive bias (Giusti, 2024, Papillon et al., 2023).
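The softmax-LeakyReLU attention weighting can be sketched as follows; the concatenation-based scoring vector `a` and the feature sizes are illustrative stand-ins for learned parameters:

```python
import numpy as np

# Attention over a cell's neighborhood in the SAN/CAN style: each
# neighbor's score passes through LeakyReLU and is softmax-normalized
# across the neighborhood, giving anisotropic aggregation weights.
def leaky_relu(x, slope=0.2):
    return np.where(x > 0, x, slope * x)

def attention_weights(h_o, neighbors, a):
    scores = np.array([leaky_relu(a @ np.concatenate([h_o, h_t]))
                       for h_t in neighbors])
    exp = np.exp(scores - scores.max())       # numerically stable softmax
    return exp / exp.sum()

w = attention_weights(np.array([1.0, 0.0]),
                      [np.array([0.5, 0.5]), np.array([0.0, 1.0])],
                      a=np.ones(4))           # weights sum to 1
```

In SAN/CAN-style models one such weighting is computed per neighborhood type (boundary and co-boundary), with separate learned scoring parameters for each.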

3. Expressivity, Bottlenecks, and Theoretical Results

  • Expressivity: Message-passing TNNs, especially those on simplicial or cellular complexes, are strictly more expressive than 1-WL (Weisfeiler-Lehman) graph neural architectures—encompassing higher-order homotopy and epistatic structures (Giusti, 2024, Papillon et al., 2023).
  • Over-squashing Mitigation: Standard GNNs' gradients decay exponentially with distance in deep/wide regimes (bottleneck phenomenon). By allowing aggregation and shortcutting over higher-dimensional structures (e.g., triangles, rings), TNNs reduce access (commute) time, mitigate over-squashing, and improve long-range information flow. Theoretical results show a direct relationship between Jacobian obstruction and commute time; rewiring by enhancing algebraic connectivity or adding higher-order adjacencies directly targets bottlenecks (Giusti, 2024).
  • Universal Approximation: TNNs parameterized over general Tychonoff spaces and functions densely approximate any uniformly continuous function therein (strong UAT). When inputs are distributions (Borel measures), TNNs recover and generalize the Deep Sets framework, but extend it from finite multisets to arbitrary positive-finite measures and arbitrary metric input domains (Kouritzin et al., 2023).
  • Persistent Homology Augmentation: Integrating persistent homology (PH) as a differentiable vectorization into TNNs (e.g., PersLay, RePHINE, TOGL, TopNets) provably increases expressivity, enabling TNNs to distinguish structures indistinguishable by message-passing schemes alone. The vectorization step involves aggregation (sum, mean, or learned) of image, landscape, or kernel transforms over birth–death pairs of PH diagrams (Verma et al., 2024, Papillon et al., 2023).
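A persistence-image style vectorization of birth-death pairs can be sketched as follows; the grid size, bandwidth, and persistence weighting are illustrative choices, not the exact PersLay/TOGL parameterizations:

```python
import numpy as np

# Persistence-image style vectorization of a PH diagram: each birth-death
# pair is mapped to (birth, persistence) coordinates and smoothed onto a
# fixed grid with a Gaussian weighted by its persistence.
def persistence_image(pairs, grid=8, sigma=0.1):
    pairs = np.asarray(pairs, dtype=float)
    birth = pairs[:, 0]
    pers = pairs[:, 1] - pairs[:, 0]          # persistence = death - birth
    xs = np.linspace(0.0, 1.0, grid)
    gx, gy = np.meshgrid(xs, xs, indexing="ij")
    img = np.zeros((grid, grid))
    for b, p in zip(birth, pers):
        img += p * np.exp(-((gx - b) ** 2 + (gy - p) ** 2) / (2 * sigma ** 2))
    return img.ravel()                        # fixed-size vector for the network

vec = persistence_image([(0.1, 0.5), (0.2, 0.9), (0.0, 0.05)])  # shape (64,)
```

The resulting fixed-length vector can be concatenated to node or cell features; in fully differentiable variants, gradients also flow back into the birth-death coordinates themselves.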

4. Domain-Specific Variants and Physical Implementations

  • Cosmology: Message-passing TNNs on combinatorial complexes (including tetrahedra, clusters) have enabled substantial improvements in cosmological parameter inference, reducing mean squared error by up to 60% on Quijote simulation benchmarks relative to GNNs. Critical to performance are higher-rank message-passing and $E(3)$ invariance in featurization and aggregation (Lee et al., 29 May 2025).
  • Neuromorphic Physics: Topological Mechanical Neural Networks (TMNNs), realized as quantum spin Hall–inspired spring–mass lattices with topologically protected edge states, implement classification via damage-tolerant pseudospin channel propagation. Training is achieved via in situ backpropagation—a local rule exploiting the adjoint of the mechanical system's dynamical matrix. These systems exhibit high robustness, parallelizability (frequency-division multiplexing), and generalize to multiple topological phases (Li et al., 10 Mar 2025).
  • Visual Topography: All-Topographic Neural Networks (All-TNNs) abandon convolutional weight sharing, learning distinct local receptive-field kernels arranged on a 2D cortical sheet, regularized via pairwise smoothness. These models recapitulate V1-like orientation maps, cortical magnification, and IT-like category patches, and show improved alignment with empirical human object recognition biases (Lu et al., 2023).
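The pairwise smoothness regularizer underlying such topographic sheets can be sketched as a squared-difference penalty between neighboring kernels; the sheet shape and the choice of norm are illustrative:

```python
import numpy as np

# Pairwise smoothness regularizer for a topographic sheet: each of the
# H x W sheet locations holds its own k-dim kernel (no weight sharing),
# and neighboring kernels are penalized for differing.
def smoothness_penalty(kernels):
    """kernels: (H, W, k) array, one local kernel per sheet location."""
    dh = kernels[1:, :, :] - kernels[:-1, :, :]   # vertical neighbor pairs
    dw = kernels[:, 1:, :] - kernels[:, :-1, :]   # horizontal neighbor pairs
    return float((dh ** 2).sum() + (dw ** 2).sum())

flat = smoothness_penalty(np.ones((4, 4, 3)))     # identical kernels -> 0.0
```

Added to the task loss, this term encourages nearby units to learn similar filters, which is what produces the smooth orientation maps and category patches described above.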

5. Implementation Methodologies and Training Algorithms

TNN design and integration span:

  • Connectivity Learning: Assign edge weights $\alpha_{ij}$ over all possible pairs in a stage; initialize to 1.0, jointly optimize with standard network parameters using SGD on

$$L = L_{\text{task}} + \lambda \|\alpha\|_1$$

Optionally, threshold the learned topology post hoc (retaining $\alpha_{ij} \geq \tau$) for a pruned, truly sparse DAG.

  • End-to-End Message Passing: For each cell type across all relevant ranks, aggregate lower, upper, and adjacently linked neighbors' messages, updating with permutation-invariant AGG and learnable combination.
  • Differentiable Topological Layers: Map persistence diagrams to vectors (e.g., persistence images) using grid-based Gaussian smoothing or inner-product computations (nonparametric). Gradients flow through the barcodes themselves (when the homology solver permits), or PH outputs are concatenated to neural inputs as fixed features (Zhao, 2021).
  • Communication-Aware TNNs: In decentralized settings (e.g., AirTNN), localized message-exchange is realized via analog, uncoded wireless transmissions, modeling topological convolutions as multi-hop shifts affected by channel fading and noise. Over-the-air implementations incorporate channel statistics in both training and inference (Fiorellino et al., 14 Feb 2025).
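The post hoc thresholding step from the connectivity-learning bullet can be sketched in a few lines; the $\alpha$ values and $\tau$ below are illustrative:

```python
import numpy as np

# Post hoc thresholding of a learned stage topology: edges with
# alpha_ij >= tau are kept, the rest are removed, leaving a sparse DAG
# (upper-triangular alpha, since edges only go from earlier to later nodes).
def prune_topology(alpha, tau=0.05):
    pruned = np.where(alpha >= tau, alpha, 0.0)
    kept = int((pruned > 0).sum())
    return pruned, kept

alpha = np.array([[0.0, 1.2, 0.01],
                  [0.0, 0.0, 0.4],
                  [0.0, 0.0, 0.0]])
pruned, kept = prune_topology(alpha)   # keeps the two edges 0->1 and 1->2
```

After pruning, a short retraining pass with the sparse topology typically recovers any small accuracy drop, as noted in Section 1.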

6. Limitations, Challenges, and Future Directions

Current TNN methodologies entail computational challenges:

  • Scalability: Managing state across millions of cells in real-world complexes requires efficient dynamic storage and neighborhood computation, especially with dynamic rewiring.
  • Over-smoothing: Deep TNNs may collapse feature diversity; initial explorations with skip connections and normalization are ongoing.
  • Expressivity vs. Efficiency: Tucker low-rank representations in tensorial TNNs decrease sample complexity but may underfit if the true underlying complexity exceeds the rank budget (Wen et al., 2024).
  • Empirical Gaps: Higher-dimensional complexes (beyond triangles/tetrahedra), dynamic/temporal networks, and rigorous unification across hypergraphs, simplicial and cellular frameworks are evolving research topics (Papillon et al., 2023, Hajij et al., 27 May 2025).
  • Physical Realization: Mechanical and neuromorphic TNNs show promise for parallel and damage-tolerant learning but require further work on nonlinear/temporal physical phenomena (Li et al., 10 Mar 2025).

Compositional and category-theoretic frameworks (e.g., copresheaf TNNs) offer unified perspectives—subsuming GNNs, CNNs, sheaf and manifold nets—by replacing the global latent space with local stalks and functorial transport maps. This multi-scale, directionally-parameterized approach enables topologically-biased, hierarchical, and anisotropic learning on arbitrary structured domains (Hajij et al., 27 May 2025).

7. Summary Table: Core TNN Classes and Benchmarks

| Class/Methodology | Topological Domain | Core Mechanism | Empirical/Grounded Advantage |
|---|---|---|---|
| Stage-topology learning | Neural net layer graph | Differentiable edge weights | +2–5% Top-1/AP (CIFAR, COCO) |
| Simplicial/Cellular TNN | Simplicial/cell complex | Higher-order MP, attention | Bottleneck mitigation, expressivity |
| PH-augmented TNN | Graphs/simplicial complexes, any | Persistent homology layers | SOTA on molecular/social graphs |
| Physics/neuromorphic TNN | Mechanical lattice | Topological modes, in situ BP | Robust, hardware-integrable |
| All-TNN (visual) | 2D cortical lattice | No weight sharing, topographic reg. | Human-aligned spatial maps |

Increasing evidence from both synthetic and applied domains demonstrates that topological inductive bias and dynamical optimization of network topology can substantially alleviate fundamental limitations of traditional neural architectures and open new application regimes (Yuan et al., 2020, Giusti, 2024, Papillon et al., 2023).
