Persistent Homology on Activations

Updated 6 May 2026

Persistent Homology on Activations is a framework that applies algebraic topology to analyze neural activation patterns and uncover multiscale topological structures.
The methodology employs activation graphs, polyhedral decompositions, and point-cloud analyses to construct filtrations and compute persistence diagrams that track topological changes layerwise.
This approach informs neural representation theory and network robustness by identifying information pathways, layerwise topological thinning, and susceptibilities to adversarial perturbations.

Persistent homology on activations is a methodology that applies tools from algebraic topology—specifically, persistent homology—to the study of activation patterns, pathways, and intermediate representations in neural networks. It characterizes the multiscale topological structure of activations, providing a rigorous approach for understanding distributed representations, layerwise transformations, polyhedral decompositions, and their implications for robustness and representation learning. This perspective supplements classical geometric and information-theoretic analyses, revealing how neural architectures transform the topology of data and highlighting the organizational principles and vulnerabilities in deep learning.

1. Formalism: Activation Graphs, Polyhedra, and Simplicial Complexes

Three principal activation-to-topology correspondences underpin persistent homology on activations:

Activation Graphs (Layerwise Graphical Structures):

For a fixed input $\mathcal{I}$ and a feed-forward architecture with $L$ layers, one induces the activation graph $G^\mathcal{I} = (V, E, \varphi)$, where:
- $V = V_0 \cup V_1 \cup \ldots \cup V_{L-1}$ is the set of neurons across all layers;
- $E = \{(u,v) : u \in V_l, v \in V_{l+1}\}$ connects neurons in adjacent layers;
- $\varphi: E \to \mathbb{R}_+$ assigns edge weights $\varphi(u,v) = |w_{u \rightarrow v} h_u|$, with $h_u$ the activation of $u$ and $w_{u \rightarrow v}$ its outgoing weight.

The adjacency matrix records $\varphi(u,v)$ for each edge $(u,v) \in E$ and $0$ elsewhere (Gebhart et al., 2019).
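The activation-graph construction can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation of Gebhart et al.: the function name `activation_graph`, the nested-list weight layout, the omission of biases, and the use of ReLU at every layer are all assumptions made for the example.

```python
def activation_graph(weights, x):
    """Build the weighted edge list of an activation graph.

    Hypothetical layout: weights[l][j][i] is the weight from neuron i in
    layer l to neuron j in layer l + 1. Biases are omitted and ReLU is
    applied after every layer (both are simplifying assumptions).
    Returns a list of ((l, i), (l + 1, j), phi) with phi = |w * h_i|.
    """
    edges = []
    h = list(x)  # activations of the current layer
    for l, W in enumerate(weights):
        pre = []
        for j, row in enumerate(W):
            for i, w in enumerate(row):
                # Edge weight: magnitude of the signal sent from i to j.
                edges.append(((l, i), (l + 1, j), abs(w * h[i])))
            pre.append(sum(w * hi for w, hi in zip(row, h)))
        h = [max(0.0, p) for p in pre]  # ReLU
    return edges


if __name__ == "__main__":
    toy_weights = [[[1.0, -2.0], [0.5, 0.0]], [[1.0, 1.0]]]
    for u, v, phi in activation_graph(toy_weights, [1.0, 2.0]):
        print(u, "->", v, "phi =", phi)
```

Because the edge weights depend on the activations $h_u$, the same network yields a different weighted graph for each input, which is what makes the graph a per-input object.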

Polyhedral Decompositions by Piecewise-Linear Activations (e.g., ReLU):

ReLU networks induce hyperplane arrangements that partition input space into convex polyhedra, each associated with a distinct global activation code. On each polyhedron the network is affine, and the collection of polyhedra forms a polyhedral decomposition. The dual graph of this decomposition has vertices indexed by codewords, with edges joining codewords that differ in exactly one neuron and whose polyhedra are adjacent (Liu et al., 2023, Beshkov, 3 Feb 2025).

Activation Point Clouds for Vietoris–Rips Complexes:

For a batch or data cloud $X$, the activations at a given layer $l$ form a point cloud in $\mathbb{R}^{d_l}$, where $d_l$ is the layer width. Pairwise distances (often Euclidean or intrinsic) define metric filtrations, typically Vietoris–Rips complexes (Wheeler et al., 2021, Naitzat et al., 2020, Shahidullah, 2022).
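To make the notion of an activation code concrete, the sketch below computes the binary on/off pattern of every ReLU neuron for an input and compares two codes by Hamming distance. The helper names `activation_code` and `hamming`, and the nested-list parameter layout, are hypothetical choices for this example rather than an API from the cited papers.

```python
def activation_code(weights, biases, x):
    """Return the binary codeword of a ReLU network at input x.

    Hypothetical layout: weights[l][j][i] and biases[l][j] parameterize
    layer l. Bit = 1 when the pre-activation is strictly positive.
    """
    code = []
    h = list(x)
    for W, b in zip(weights, biases):
        pre = [sum(w * hi for w, hi in zip(row, h)) + bj
               for row, bj in zip(W, b)]
        code.extend(1 if p > 0 else 0 for p in pre)
        h = [max(0.0, p) for p in pre]  # ReLU output feeds the next layer
    return tuple(code)


def hamming(a, b):
    """Number of neurons whose on/off state differs between two codes."""
    return sum(ai != bi for ai, bi in zip(a, b))


if __name__ == "__main__":
    W = [[[1.0, 0.0], [0.0, 1.0]]]  # one layer, identity weights
    b = [[0.0, 0.0]]
    c1 = activation_code(W, b, (1.0, 1.0))
    c2 = activation_code(W, b, (1.0, -1.0))
    print(c1, c2, hamming(c1, c2))
```

Inputs that share a codeword lie in the same polyhedron; a Hamming distance of one is a necessary condition for two codewords to be joined in the dual graph, though adjacency of the polyhedra must still be checked separately.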

2. Persistent Homology Pipeline: Filtration, Computation, and Summarization

The persistent homology pipeline for activation analysis comprises:

Filtration Construction:
- Activation Graphs: Edges are sorted by $\varphi$ in nonincreasing order, giving a superlevel-set filtration $G_1 \subseteq G_2 \subseteq \ldots \subseteq G^\mathcal{I}$ in which each subgraph keeps the edges whose weight is at or above the current threshold. Each $G_i$ is a one-dimensional simplicial complex (vertices and edges) (Gebhart et al., 2019).
- Polyhedral Dual Graphs: For data points $x_i$, compute the corresponding binary codewords $c(x_i)$; define pairwise Hamming, graph-geodesic, or other combinatorial metrics to build filtration complexes such as Vietoris–Rips, clique, or order complexes (Liu et al., 2023).
- Activation Point Clouds: Apply standard distance-based filtrations, typically Euclidean Vietoris–Rips, to the activation vectors of each layer (Wheeler et al., 2021, Naitzat et al., 2020, Shahidullah, 2022).
Persistent Homology Computation:
- Apply standard matrix reduction, or union-find for $H_0$ (connected components), which runs in $O(|E|\,\alpha(|V|))$ time once the edges are sorted ($\alpha$ is the inverse Ackermann function); use PH software (Ripser, Perseus, PHAT) for higher homology (Gebhart et al., 2019, Liu et al., 2023).
- Each homology class is tracked by its birth and death indices, producing persistence diagrams $D_k$, barcodes, and Betti curves:
$\beta_k(t) = \#\{(b, d) \in D_k : b \le t < d\}$
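The union-find computation of connected-component persistence on an edge-weighted graph can be sketched as follows. This is an illustrative version under stated assumptions: all vertices are taken to be present from the start of the filtration, edges enter in nonincreasing weight order, and the function name `h0_superlevel_persistence` is hypothetical.

```python
def h0_superlevel_persistence(num_vertices, edges):
    """Connected-component persistence for a superlevel edge filtration.

    edges: iterable of (u, v, weight) with 0 <= u, v < num_vertices.
    Assumes every vertex exists from the start of the filtration.
    Returns (deaths, num_essential): the edge weights at which one
    component merges into another, and the number of components that
    survive the whole filtration.
    """
    parent = list(range(num_vertices))
    size = [1] * num_vertices

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    deaths = []
    # Largest weights first: strongest connections enter the filtration first.
    for u, v, w in sorted(edges, key=lambda e: e[2], reverse=True):
        ru, rv = find(u), find(v)
        if ru == rv:
            continue  # edge closes a cycle, so H_0 does not change
        if size[ru] < size[rv]:
            ru, rv = rv, ru
        parent[rv] = ru  # union by size
        size[ru] += size[rv]
        deaths.append(w)
    return deaths, num_vertices - len(deaths)


if __name__ == "__main__":
    toy_edges = [(0, 1, 3.0), (1, 2, 1.0), (0, 2, 2.0)]
    print(h0_superlevel_persistence(4, toy_edges))
```

Sorting dominates the running time; the union-find pass itself is near-linear in the number of edges. Higher-dimensional homology needs boundary-matrix reduction, for which dedicated software is the practical choice.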

Summaries include Wasserstein/bottleneck distances, landscapes ($\lambda_k$), images, and silhouettes (Gebhart et al., 2019, Wheeler et al., 2021).
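Two of these summaries are simple enough to evaluate directly from a diagram. The sketch below uses the standard definitions: the Betti curve counts the intervals alive at $t$, and the $k$-th persistence landscape is the $k$-th largest value of the tent function $\max(0, \min(t - b, d - t))$ over all pairs $(b, d)$. It assumes finite intervals with $b < d$; the function names are hypothetical.

```python
def betti_curve(diagram, t):
    """Number of intervals (b, d) in the diagram with b <= t < d."""
    return sum(1 for b, d in diagram if b <= t < d)


def landscape(diagram, k, t):
    """k-th persistence landscape (k = 1, 2, ...) evaluated at t.

    Each interval (b, d) contributes a tent of height
    max(0, min(t - b, d - t)); the landscape is the k-th largest tent.
    """
    tents = sorted(
        (max(0.0, min(t - b, d - t)) for b, d in diagram),
        reverse=True,
    )
    return tents[k - 1] if k <= len(tents) else 0.0


if __name__ == "__main__":
    toy_diagram = [(0.0, 4.0), (1.0, 3.0)]
    print(betti_curve(toy_diagram, 2.0))
    print(landscape(toy_diagram, 1, 2.0), landscape(toy_diagram, 2, 2.0))
```

Both functions map a diagram to a real-valued function of $t$, which is convenient when diagrams must be averaged or fed to a vector-based classifier.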

3. Layerwise, Pathwise, and Relative Topological Signatures

Multiple representations can be analyzed:

Persistent Subgraphs (0-cycles) as “Information Pathways”: Long-lived $H_0$ classes correspond to globally relevant signal pathways from input pixels to output neurons; their lifetimes quantify pathway globality (Gebhart et al., 2019).
Layerwise Evolution of Betti Numbers (Topological Simplification):
- In both simulated and real data, the Betti numbers $\beta_k$ fall sharply across layers, particularly with ReLU activations, indicating "topological thinning" as data manifolds are unfolded and collapsed to linearly separable representations (Naitzat et al., 2020, Shahidullah, 2022).
- With bijective nonlinearities (e.g., Tanh), topological features (e.g., loops) persist longer through more layers; non-bijective ReLU mappings rapidly destroy higher homology, consistent with their nonhomeomorphic behavior (Shahidullah, 2022).
Relative Homology via Polyhedral Overlap Decomposition:
- The quotient of the input manifold $\mathcal{M}$ by the network-induced "overlap" decomposition $\mathcal{O}$ yields relative-homology groups $H_k(\mathcal{M}, \mathcal{O})$, allowing metric-independent computation of topological invariants under contractibility assumptions on the polyhedral cells. These capture purely topological (not geometric) deviations in representation as training progresses (Beshkov, 3 Feb 2025).
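The contrast between bijective and non-bijective activations noted above can be seen in a deliberately tiny example. The sketch counts connected components ($\beta_0$) of the neighborhood graph at a fixed scale, before and after applying an elementwise activation to a one-dimensional point cloud. It is a toy under strong assumptions: there are no weights or biases, the scale `eps` and the point coordinates are arbitrary choices, and an edge is drawn when the distance is at most `eps` (one of several Vietoris–Rips scale conventions).

```python
import math


def betti0_at_scale(points, eps):
    """Connected components of the graph joining points at distance <= eps.

    This equals beta_0 of the Vietoris-Rips complex at that scale.
    points: list of equal-length tuples of floats.
    """
    n = len(points)
    parent = list(range(n))

    def find(a):
        while parent[a] != a:
            parent[a] = parent[parent[a]]  # path halving
            a = parent[a]
        return a

    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(points[i], points[j]) <= eps:
                parent[find(i)] = find(j)
    return len({find(i) for i in range(n)})


def apply_elementwise(fn, points):
    """Apply an activation function coordinate-wise to every point."""
    return [tuple(fn(v) for v in p) for p in points]


if __name__ == "__main__":
    # Two tight clusters, both on the negative half-line (toy data).
    cloud = [(-5.0,), (-5.02,), (-1.0,), (-1.02,)]
    eps = 0.05
    relu_cloud = apply_elementwise(lambda v: max(0.0, v), cloud)
    tanh_cloud = apply_elementwise(math.tanh, cloud)
    print("input:", betti0_at_scale(cloud, eps))
    print("relu: ", betti0_at_scale(relu_cloud, eps))
    print("tanh: ", betti0_at_scale(tanh_cloud, eps))
```

In this toy, ReLU sends both clusters to the same point and merges the two components, while Tanh, being injective, keeps them apart at the chosen scale. Real layers also include affine maps, so the example only isolates the effect of the nonlinearity.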

4. Empirical Findings: Robustness, Adversarial Examples, and Representation Analysis

Adversarial Example Anatomy:
- Persistent homology of activation graphs reveals that adversarial perturbations, such as PGD or CW attacks, do not create new topological features associated with target classes but instead subtly reroute dominant activation pathways, resulting in misclassification even as the activation subgraph topology remains similar (Gebhart et al., 2019).
- For a range of architectures, subgraph-based SVMs trained on persistence-weighted templates robustly recover the correct class for adversarial instances (70–83% accuracy), even where the base network fails on 100% of such inputs. Wasserstein distances between diagram pairs correlate with noise/perturbation magnitude (Gebhart et al., 2019).
Manifold Hypothesis and Layerwise Homeomorphism:
- In early layers, network maps act as approximate homeomorphisms, preserving the input manifold’s Betti numbers, while deeper layers collapse non-essential topology for classification (Shahidullah, 2022, Naitzat et al., 2020).
- Deep architectures distribute topological changes more evenly across layers; shallow ones concentrate topological alterations in later stages (Naitzat et al., 2020).
Polyhedral and Graph-based Persistence is Stable and Scalable:
- Graph-based persistence using combinatorial metrics (Hamming, graph-geodesic) on activation codes converges (in bottleneck distance) to the true manifold homology with increased sampling, providing practical and robust large-scale analysis (Liu et al., 2023).

5. Comparative Methodologies, Limitations, and Extensions

Approach	Object	Metric/Complex	Distinguishing feature
Activation Graph PH (Gebhart et al., 2019)	Graph $G^\mathcal{I}$	Edge-weight	Encodes explicit pathway structure
Polyhedral Dual Graph PH (Liu et al., 2023)	Bit-code graph	Hamming/geodesic	Encodes activation cell transitions
Point-cloud VR PH (Wheeler et al., 2021, Naitzat et al., 2020)	Activation vectors	Euclidean	Layerwise geometric distortion
Relative Homology (Beshkov, 3 Feb 2025)	Overlap classes	None	Purely topological equivalence

Choice of filtration and metric influences sensitivity: Graph-based methods avoid geometric artifacts, while Euclidean VR-PH is sensitive to sampling density and curvature (Beshkov, 3 Feb 2025).
Limitations: Polyhedral explosion (exponential in neurons/layers), convexity requirements for relative homology, and computational costs for large-scale data constrain feasible domains (Beshkov, 3 Feb 2025, Liu et al., 2023).
Extensions: Multimetric and multiparameter filtrations across resolutions; local, per-neuron, or stratified homology to probe finer representational features; integration with classical interpretability approaches (Wheeler et al., 2021, Liu et al., 2023).

6. Significance for Neural Representation Theory and Robustness

Persistent homology on activations offers principled, multiscale measurement of neural architectures’ effect on data topology, with ramifications for robustness, interpretability, and the structure of representation spaces. Topological simplification induced by architecture and activation nonlinearity may explain observed differences in network generalization. The framework exposes how class-relevant “information pathways” can be rerouted or destabilized by small input perturbations, accounting for adversarial vulnerability not as the introduction of novel class features but as subtle deformation within the existing subgraph topology (Gebhart et al., 2019). Quantifying these effects via persistent homology enables more targeted architectural design, informed robustness regularization, and in-depth comparison of learned representations across models and tasks.