Topological Graph Pooling (TGPool)

Updated 10 March 2026
  • TGPool refers to pooling operations in GNNs that leverage topological data analysis to preserve global and higher-order graph features.
  • It employs methods like persistent homology, witness complexes, and clique pooling to maintain key structural motifs during graph coarsening.
  • Comparative studies show TGPool enhances graph classification accuracy and interpretability by retaining persistent topological invariants.

Topological Graph Pooling (TGPool) refers to a category of pooling operations for Graph Neural Networks (GNNs) that leverage graph topology, and, in most advanced forms, explicitly incorporate tools from topological data analysis such as persistent homology and simplicial complexes. Unlike vanilla node-wise, attention, or local-scoring approaches, TGPool seeks to preserve global and higher-order topological signatures—such as cycles and connected components—throughout the pooling hierarchy. This results in improved stability, interpretability, and discriminative power, especially in domains where structural motifs are crucial.

1. Foundations and Motivation

Conventional graph pooling methods, whether by global ranking or local assignment, tend to underutilize the complex topological structures inherent in many real-world graphs. Most standard operators focus on local features or community structure, making them prone to losing critical global invariants—such as large rings in molecular graphs or nontrivial cycles in social networks—during coarsening. TGPool approaches seek to bridge this gap by explicitly encoding structural topology into the pooling mechanism, driven by the insight that persistent features (long-lived in a filtration) reflect essential graph-scale properties (Chen et al., 2023, Ying et al., 2024).

Simplicial complexes, persistent diagrams, and related TDA machinery are central in expressing these multiscale topological features, and TGPool adapts these techniques for end-to-end GNN integration.

2. Methodological Variants

Several distinct formulations have been proposed for TGPool, which may be grouped by the nature of topological information used:

  • Persistent Homology-guided Pooling: Methods such as Topology-Invariant Pooling (TIP) use persistence diagrams computed from input or coarsened graphs to steer the node/edge selection. At each pooling layer, a filtration function generates a persistence diagram; edge weights or node clusterings are then adjusted to preserve large-persistence cycles, and a "topological loss" is added to enforce invariance (Ying et al., 2024).
  • Landmark and Witness Complex Pooling: Wit-TopoPool leverages landmark sets to construct approximate witness complexes, efficiently summarizing topology at both local and global scales. Local persistence diagrams are derived from neighborhoods around each node's embedding, scoring local topological salience; global structure is captured by computing the persistent homology on a landmark-based witness complex, yielding a graph-level topological embedding (Chen et al., 2023).
  • Clique-based Pooling: Clique pooling operates by compressing maximal cliques into supernodes, leading to a strongly topological, parameter-free coarsening that systematically reduces graphs while explicitly encoding their higher-order connective structure (Luzhnica et al., 2019).
  • Structural Similarity Pooling: SimPool determines pooling assignments via cosine-similarity on powers of the adjacency matrix, thereby capturing higher-order node roles and role similarity in a purely topological fashion (Shulman, 2020).
  • Spatial Embedding Pooling: Approaches based on spatial graph embeddings (such as DeepWalk) perform node sampling (e.g., via farthest-point-sampling) in the embedding space to ensure global topological feature coverage, with hard or soft assignment for feature aggregation (Rahmani et al., 2019).
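The structural-similarity idea behind SimPool can be illustrated with a short sketch: cosine similarity between rows of a power of the adjacency matrix identifies nodes whose p-hop connectivity patterns match. This is a minimal, hypothetical sketch of the underlying principle, not the SimPool implementation; the function name and the choice p = 2 are illustrative.

```python
import numpy as np

def structural_similarity(A: np.ndarray, p: int = 2) -> np.ndarray:
    """Cosine similarity between rows of A^p. Each row of A^p counts
    p-step walks, so nodes with matching rows play similar structural roles."""
    Ap = np.linalg.matrix_power(A, p).astype(float)
    norms = np.linalg.norm(Ap, axis=1, keepdims=True)
    norms[norms == 0] = 1.0          # isolated nodes: avoid division by zero
    R = Ap / norms
    return R @ R.T

# 4-cycle: opposite corners have identical 2-hop connectivity patterns,
# so their similarity is exactly 1, while adjacent nodes score 0.
A = np.array([[0, 1, 0, 1],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [1, 0, 1, 0]])
S = structural_similarity(A, p=2)
```

Pooling assignments would then group nodes whose pairwise similarity exceeds a threshold, capturing role equivalence without reference to node features.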

3. Formal Framework and Key Algorithms

The following summarizes key TGPool algorithmic constructs:

Persistent Homology Pooling (TIP)

  • Learn a filtration function f : V \cup E \to \mathbb{R} and map node and edge weights accordingly.
  • Compute the persistence diagram \mathcal{D}_1 = \{ (\alpha^b_j, \alpha^d_j) \}_{j=1}^m, with persistence p_j = \alpha^d_j - \alpha^b_j.
  • Reweight the adjacency so that edges sustaining high-persistence cycles are amplified: A^{(\ell)} = A'^{(\ell)} \odot (d - b), where d, b are the vectors of death and birth times.
  • Minimize a topological loss enforcing vectorized moment matching (means, standard deviations) between the original and pooled diagrams: L_{\text{topo}} = \sum_\ell \| [\mu^{(\ell)}, \sigma^{(\ell)}] - [\mu^{(0)}, \sigma^{(0)}] \|_2^2 (Ying et al., 2024).
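The filtration-and-pairing step can be made concrete with a small sketch. The code below computes a 0-dimensional persistence diagram (component births and deaths) over an edge filtration using the standard union-find pairing; TIP itself targets 1-dimensional cycles, which requires boundary-matrix reduction, but the pairing logic is analogous. The function name and example weights are hypothetical.

```python
# Illustrative 0-dimensional persistence over an edge filtration using
# union-find. Every node is born at filtration value 0; when an edge merges
# two components, one component dies at that edge's weight.
def zeroth_persistence(n, weighted_edges):
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    pairs = []                              # (birth, death) per merged component
    for w, u, v in sorted(weighted_edges):  # process edges in filtration order
        ru, rv = find(u), find(v)
        if ru != rv:
            parent[ru] = rv
            pairs.append((0.0, w))          # a component dies at weight w
    return pairs                            # surviving components are essential

# A 4-node graph: the edge (0.7, 0, 2) closes a cycle, so it pairs nothing
# in dimension 0 (it would instead give birth to a 1-dimensional class).
edges = [(0.3, 0, 1), (0.5, 1, 2), (0.7, 0, 2), (0.9, 2, 3)]
diagram = zeroth_persistence(4, edges)
persistence = [d - b for b, d in diagram]
```

Sorting the edges dominates the cost, which matches the O(m log m) bound quoted later for sparse graphs; with pre-sorted edges only the near-linear union-find term remains.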

Wit-TopoPool

  • For each node u, construct a feature-similarity neighborhood, build its Vietoris–Rips complex, and compute a local persistence diagram.
  • Assign each node a score as the sum of lifespans, y_u = \sum_\rho (d_\rho - b_\rho), or a bounded variant, y_u = \sum_\rho \arctan(C (d_\rho - b_\rho)^\eta).
  • Select top-K nodes, run GCN on induced subgraph.
  • For global topology, choose a landmark set, compute witness complex, persistence image, and feed to MLP for global representation.
  • Concatenate local and global topological summaries for the final graph embedding (Chen et al., 2023).
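Given precomputed local persistence diagrams, the scoring and top-K selection steps above can be sketched as follows. The diagrams here are hypothetical inputs, and C and \eta are the weighting parameters named in the scoring formula; everything else is illustrative.

```python
import math

def node_score(diagram, C=1.0, eta=1.0, bounded=True):
    """Topological salience of a node from its local persistence diagram:
    either the raw total lifespan, or the bounded variant
    y_u = sum_rho arctan(C * (d - b)^eta), which caps the contribution
    of any single long-lived feature at pi/2."""
    if bounded:
        return sum(math.atan(C * (d - b) ** eta) for b, d in diagram)
    return sum(d - b for b, d in diagram)

def top_k_nodes(local_diagrams, k):
    """Rank nodes by topological salience and keep the top-k indices."""
    scores = {u: node_score(dgm) for u, dgm in local_diagrams.items()}
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Hypothetical local diagrams: node 2 carries the longest-lived feature.
diagrams = {0: [(0.0, 0.2)], 1: [(0.1, 0.4), (0.0, 0.3)], 2: [(0.0, 1.5)]}
kept = top_k_nodes(diagrams, k=2)
```

The retained indices would then define the induced subgraph passed to the GCN in the next bullet.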

Clique Pooling

  • Enumerate maximal cliques via Bron–Kerbosch, greedily assign nodes to the largest available clique.
  • Build the coarsened adjacency by linking supernodes when their underlying cliques are joined in the original graph.
  • Pool features within each supernode via mean or max (Luzhnica et al., 2019).
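Assuming an adjacency-set graph representation, the three steps above can be sketched as follows (a simplified Bron–Kerbosch without pivoting; function names are illustrative):

```python
def bron_kerbosch(R, P, X, adj, out):
    """Enumerate maximal cliques (basic Bron-Kerbosch, no pivoting)."""
    if not P and not X:
        out.append(R)
        return
    for v in list(P):
        bron_kerbosch(R | {v}, P & adj[v], X & adj[v], adj, out)
        P = P - {v}
        X = X | {v}

def clique_pool(adj):
    """Greedily assign each node to the largest maximal clique containing it,
    then link supernodes whose member cliques share an edge in the original."""
    cliques = []
    bron_kerbosch(set(), set(adj), set(), adj, cliques)
    cliques.sort(key=len, reverse=True)        # largest cliques claim nodes first
    assignment, used, supernodes = {}, set(), []
    for c in cliques:
        members = c - used                     # nodes not yet claimed
        if members:
            for v in members:
                assignment[v] = len(supernodes)
            used |= members
            supernodes.append(members)
    coarse = {i: set() for i in range(len(supernodes))}
    for u in adj:
        for v in adj[u]:
            cu, cv = assignment[u], assignment[v]
            if cu != cv:                       # original edge crosses supernodes
                coarse[cu].add(cv)
                coarse[cv].add(cu)
    return supernodes, coarse

# Triangle {0,1,2} with a pendant edge 2-3 coarsens to two linked supernodes.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
supernodes, coarse = clique_pool(adj)
```

Feature pooling would then take a mean or max over each `supernodes[i]`, per the last bullet above.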

4. Experimental Evidence and Comparative Analysis

Across multiple benchmarks—chemical (ENZYMES, PROTEINS, DD, NCI1), social (COLLAB, IMDB), and synthetic graphs designed to test topological sensitivity—TGPool methods consistently outperform baselines in both accuracy and preservation of global structure. In "Boosting Graph Pooling with Persistent Homology," topological preservation, measured by the 1-Wasserstein distance between persistence diagrams, improves by an order of magnitude over traditional dense poolers. Graph classification accuracy increases substantially (e.g., DiffPool 77.6% → DiffPool-TIP 83.8% on NCI1; 48.3% → 65.1% on ENZYMES) (Ying et al., 2024).

Wit-TopoPool yields relative gains of ~5% over the best runner-up on molecular graphs and 2–3% gains on social graphs (Chen et al., 2023). Clique pooling offers competitive performance without parameter tuning (e.g., 60.7% on ENZYMES, 77.3% on DD), with strong interpretability, and remains competitive even when used to replace standard stride pooling in grid-structured data (e.g., CIFAR-10 as a graph) (Luzhnica et al., 2019).

SimPool demonstrates particular advantage when node attributes are absent or weak, with higher stability and locality preservation than feature-driven pooling (Shulman, 2020).

5. Computational Complexity and Practical Considerations

TGPool approaches differ in computational overhead:

  • Persistent homology computation for 1-dimensional diagrams on sparse graphs costs O(m \alpha(m)) with union-find, where m is the number of edges, or O(m \log m) including the edge sort; higher-dimensional homology is cubic in the number of simplices, requiring approximations (witness complexes) in practice (Ying et al., 2024, Chen et al., 2023).
  • Witness complex construction is O(|\mathcal{L}| \log N) in the number of landmarks |\mathcal{L}| and nodes N.
  • Clique enumeration via Bron–Kerbosch is worst-case exponential in N but is efficient on most real-world graphs (Luzhnica et al., 2019).
  • For SimPool, the dominant costs are O(p\, \mathrm{nnz}(A)) for computing the adjacency power and O(n\, d_\mu^2) for pairwise similarity computations, where d_\mu is the mean degree (Shulman, 2020).

Empirical studies indicate practical scalability for moderate graph sizes, though persistent homology bottlenecks may necessitate landmark-based or approximate schemes for very large graphs.

Key hyper-parameters include filtration thresholds, pooling ratios, landmark set size, persistence weighting exponents, and (for differentiable variants) architecture and learning rates.

6. Interpretability, Limitations, and Theoretical Insights

TGPool methods offer well-defined and interpretable abstraction: each pooled supernode, clique, or complex corresponds to a concrete subgraph or topological motif. Analysis by Luzhnica et al. (2019) demonstrates that any connected finite graph coarsens to a singleton in finitely many clique pooling steps, and that the resulting hierarchy mirrors standard CNN multiscale operations on regular grids.

A core limitation is computational: persistent homology over high-dimensional complexes is costly, and current implementations primarily consider only 1-dimensional homology (cycles). The approximation quality of witness complexes, landmark selection strategy, and the stability of learned filtration functions also influence performance (Chen et al., 2023, Ying et al., 2024).

The non-learned nature of some TGPool variants (e.g., clique, spatial) ensures stability but may restrict adaptivity. Learnable topological pooling (e.g., through differentiable persistence summaries or vectorized PDs) addresses this at the cost of increased model complexity and interpretability.

7. Outlook and Future Directions

Ongoing directions for TGPool research include:

  • Extension to higher-dimensional persistent homology, enabling detection and preservation of cavities (\beta_2), voids, and higher-order motifs (Ying et al., 2024).
  • Integration with sparse and edge-drop pooling operators to unify local and global topological invariants.
  • Application to broader domains: 3D point-cloud pooling, link prediction, dynamic graphs, and generative models.
  • Enhanced loss functions: incorporating multi-scale or approximate optimal-transport metrics between persistence diagrams.
  • Analysis of theoretical tradeoffs between expressive power, interpretability, and computational cost, particularly concerning landmark selection and persistent image embeddings.

TGPool exemplifies a principled unification of topological data analysis and GNN hierarchical learning, delivering concrete improvements in both theoretical expressivity and practical task metrics across a variety of graph learning problems (Chen et al., 2023, Ying et al., 2024, Luzhnica et al., 2019, Shulman, 2020, Rahmani et al., 2019).
