Extended Topological Pseudodistances (ETD)
- Extended Topological Pseudodistances (ETD) are advanced pseudo-metrics that extend classical measures to multi-dimensional persistent homology, accommodating torsion and multi-field data.
- They integrate algebraic formulations, directional extended persistence, and projection-based techniques to achieve computational efficiency and robust stability guarantees.
- ETDs are applied in shape matching, scientific data analysis, and machine learning to deliver near-classical Wasserstein accuracy at significantly reduced computational cost.
Extended Topological Pseudodistances (ETD) are a family of pseudo-metrics developed for the stable and efficient comparison of topological invariants arising in persistent homology, multi-field topology, and shape analysis. They generalize classical distances such as the bottleneck and $p$-Wasserstein metrics on persistence diagrams, as well as matching distances for persistence modules, to accommodate multi-parameter filtrations, torsion in homology, multi-field data, and computational constraints. ETDs span a spectrum of constructions: algebraically defined metrics on multidimensional persistence modules, directionally averaged distances built on extended persistence, and fast vectorization-based pseudometrics.
1. Foundational Definitions and Variants
ETDs arise in multiple, interrelated formulations, notably:
- The algebraic ETD $d_T$ for multidimensional persistent homology groups with arbitrary Abelian coefficients, including torsion. For compact spaces $X$, $Y$ with filtering functions $\varphi$, $\psi$, $d_T$ is defined by considering surjective homomorphisms between persistent homology groups, smeared by a uniform perturbation $\vec\epsilon$ across parameter space. Specifically,
$d_T\bigl(H_k^{(X,\varphi)},H_k^{(Y,\psi)}\bigr) = \inf \left\{\epsilon \geq 0 : \forall (\vec u, \vec v)\in\Delta^+, \; \text{there exist surjective homomorphisms between certain subgroups at } (\vec u, \vec v) \text{ and } (\vec u-\vec\epsilon, \vec v+\vec\epsilon)\right\}$
as precisely described in (Frosini, 2010).
- The Extended Pseudometric from the Extended Persistent Homology Transform (XPHT) framework for shapes embedded in Euclidean space, where each unit direction yields an extended persistence module of a height function, and the ETD is the integral over directions of the $p$-Wasserstein distances between the corresponding extended modules, as formalized in (Turner et al., 2022).
- The multi-field ETD based on hierarchically-constructed Multi-Dimensional Reeb Graphs (MDRGs) and their associated persistence diagrams, matching across dimensions and levels via optimal bijections and bottleneck distances, see (Ramamurthi et al., 2023).
- The scalable projection-based ETD for persistence diagrams: given a finite family of projection directions, and for each homology degree, one projects the points of the diagrams (and appends diagonal projections), computes the 1D Wasserstein distance in each projection, and aggregates the results over directions and homology degrees, as analyzed in (Nuñez et al., 2024).
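The projection-based construction can be illustrated with a minimal sketch. It is not the reference implementation of (Nuñez et al., 2024): the function names, the evenly spaced angles, the averaging over directions, and the summation over homology degrees are all assumptions made for illustration.

```python
import math

def diagonal_projection(point):
    """Orthogonal projection of a (birth, death) point onto the diagonal."""
    mid = (point[0] + point[1]) / 2.0
    return (mid, mid)

def wasserstein_1d(xs, ys, p=1):
    """p-Wasserstein cost between equal-size point sets on a line.

    In 1D, matching the sorted values in order is optimal for p >= 1.
    """
    if len(xs) != len(ys):
        raise ValueError("inputs must have the same number of points")
    cost = sum(abs(a - b) ** p for a, b in zip(sorted(xs), sorted(ys)))
    return cost ** (1.0 / p)

def projection_distance(diag_a, diag_b, n_directions=8, p=1):
    """Average 1D Wasserstein cost over evenly spaced directions.

    Averaging is an assumed aggregation rule, chosen for simplicity.
    """
    if n_directions < 1:
        raise ValueError("need at least one direction")
    # Append the diagonal projections of the other diagram so both sets have equal size.
    aug_a = list(diag_a) + [diagonal_projection(q) for q in diag_b]
    aug_b = list(diag_b) + [diagonal_projection(q) for q in diag_a]
    total = 0.0
    for i in range(n_directions):
        theta = -math.pi / 2 + math.pi * i / n_directions
        c, s = math.cos(theta), math.sin(theta)
        proj_a = [c * x + s * y for x, y in aug_a]
        proj_b = [c * x + s * y for x, y in aug_b]
        total += wasserstein_1d(proj_a, proj_b, p)
    return total / n_directions

def multi_degree_distance(diags_a, diags_b, n_directions=8, p=1):
    """Sum per-degree distances; inputs map homology degree -> list of points."""
    degrees = set(diags_a) | set(diags_b)
    return sum(
        projection_distance(diags_a.get(k, []), diags_b.get(k, []), n_directions, p)
        for k in degrees
    )
```

Each diagram is a list of `(birth, death)` pairs. A call such as `projection_distance([(0.0, 1.0), (0.2, 0.5)], [(0.0, 1.2)])` returns a nonnegative number, and comparing a diagram with itself returns zero.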
2. Theoretical Properties and Connections
2.1. Pseudometric Structure
Across the main variants, ETDs satisfy nonnegativity and symmetry, and obey the triangle inequality by construction. The identity of indiscernibles typically fails: ETDs can assign distance zero to non-identical topological data, due to coarseness (as in XPHT, if two shapes' transforms coincide in all directions) or diagram-level degeneracies (distinct Reeb graphs with identical persistence diagrams).
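Written out for a generic ETD $d$ and topological summaries $A$, $B$, $C$ (generic symbols introduced here for illustration), the pseudometric axioms described above read:

```latex
\begin{aligned}
& d(A,B) \ge 0, \qquad d(A,A) = 0, \\
& d(A,B) = d(B,A), \\
& d(A,C) \le d(A,B) + d(B,C), \\
& \text{while } d(A,B) = 0 \ \text{does not imply} \ A = B .
\end{aligned}
```

Only the last line separates a pseudometric from a metric: passing to equivalence classes of objects at distance zero yields a genuine metric on the quotient.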
2.2. Stability and Approximations
- $d_T$ is stable under perturbations of the filtering functions, with a quantitative bound given in Corollary 3.8 of (Frosini, 2010).
- All examined ETDs respond Lipschitz-continuously to small topological changes. For example, the MDRG construction admits an explicit stability bound for bivariate fields, as in (Ramamurthi et al., 2023).
- For the projection-based ETD, as the number of projection directions increases, the distance converges to the Sliced-Wasserstein distance. For a finite number of directions, tight continuity and Lipschitz-type bounds are inherited from persistence diagram theory, up to multiplicative constants depending on the cardinality of the direction set, the chosen aggregation parameters, and the size of the diagrams (Nuñez et al., 2024).
2.3. Generalization Power
- $d_T$ subsumes both the natural pseudo-distance (in shape matching) and the matching distance (for 1D filtrations over a field), but crucially extends to settings where classical invariants fail, such as torsion coefficients or multidimensional filtration parameters (Frosini, 2010).
- XPHT-based ETDs avoid infinite distances in cases where Betti numbers differ, by using extended persistence (essential classes appear and die at finite intervals in the module) (Turner et al., 2022).
3. Algorithmic Strategies and Complexity
| ETD Method | Key Step | Time Complexity |
|---|---|---|
| $d_T$ on multidimensional PH | Algebraic subgroup/surjection checking | Unspecified; no polynomial-time algorithm known |
| XPHT-based ETD | Directional extended PH and integration | Per-shape cost given in (Turner et al., 2022) |
| MDRG ETD | MDRG + PD + assignment matching | See (Ramamurthi et al., 2023) |
| Projection-based ETD | 1D projections, sorting, aggregation | See (Nuñez et al., 2024) |
For projection ETDs, increasing the number of projections interpolates between the computational extremes of cheap, non-injective statistics and the full Sliced-Wasserstein metric. Even for large diagrams, the projection-based ETD with a small number of projections runs in milliseconds per pair, whereas exact Wasserstein matching scales far less favorably with diagram size (Nuñez et al., 2024).
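The efficiency of the projection step rests on a standard fact: on a line, matching sorted values in order is an optimal transport plan, so each 1D Wasserstein distance costs one sort instead of a search over matchings. The following minimal sketch checks this on a small example by comparing against brute-force enumeration; the function names and sample values are illustrative assumptions.

```python
from itertools import permutations

def w1_by_sorting(xs, ys):
    """1-Wasserstein cost on the line: sort both sets and match in order."""
    return sum(abs(a - b) for a, b in zip(sorted(xs), sorted(ys)))

def w1_by_brute_force(xs, ys):
    """Same cost obtained by trying every bijection (factorial time, tiny inputs only)."""
    return min(
        sum(abs(a - b) for a, b in zip(xs, perm))
        for perm in permutations(ys)
    )

xs = [0.9, 0.1, 0.4, 0.7]
ys = [0.3, 0.8, 0.2, 0.6]

# Both strategies agree up to floating-point rounding.
print(w1_by_sorting(xs, ys), w1_by_brute_force(xs, ys))
```

In the plane no such shortcut exists, and exact diagram matching has to solve a general assignment problem, which is where the cost gap between projection-based and exact distances comes from.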
For XPHT-based ETDs in image settings, boundary extraction dominates the cost, followed by a per-direction cost for PH and local tests. Computing pairwise distances across a dataset adds a further cost that grows with the number of shape pairs (Turner et al., 2022).
4. Extension to Multi-Field and High-Dimensional Data
The ETD for Multi-Dimensional Reeb Graphs (MDRG) is constructed by recursively building a hierarchy of Reeb graphs for each field component, associating persistence diagrams at every node, and matching at each level using bottleneck distances and optimal bijections. This enables capturing the topological complexity of multi-field data and supports robust classification and scientific analysis tasks, as detailed in (Ramamurthi et al., 2023).
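The building block used at each level of this matching is the bottleneck distance between two persistence diagrams. As a minimal sketch, the brute-force version below enumerates all bijections and is only practical for very small diagrams. It illustrates the standard bottleneck distance, not the full MDRG-level construction of (Ramamurthi et al., 2023), and the function names are assumptions.

```python
from itertools import permutations

def linf(p, q):
    """L-infinity distance between two (birth, death) points."""
    return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

def cost_to_diagonal(p):
    """L-infinity distance from a point to the diagonal."""
    return abs(p[1] - p[0]) / 2.0

def bottleneck_brute_force(diag_a, diag_b):
    """Bottleneck distance by enumerating all bijections (tiny diagrams only)."""
    n, m = len(diag_a), len(diag_b)
    size = n + m  # each side: its own points plus one diagonal slot per opposite point

    def cost(i, j):
        if i < n and j < m:
            return linf(diag_a[i], diag_b[j])   # point matched to point
        if i < n:
            return cost_to_diagonal(diag_a[i])  # point of A sent to the diagonal
        if j < m:
            return cost_to_diagonal(diag_b[j])  # point of B sent to the diagonal
        return 0.0                              # diagonal slot matched to diagonal slot

    best = float("inf")
    for perm in permutations(range(size)):
        worst = max((cost(i, perm[i]) for i in range(size)), default=0.0)
        best = min(best, worst)
    return best
```

For realistic diagram sizes, the enumeration would be replaced by a threshold search combined with bipartite matching, or by an established TDA library.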
The projection-based ETD construction for persistence diagrams takes a directional approach: it projects diagrams onto lines in the plane at various angles, compares the resulting 1D point sets using the Wasserstein metric, and aggregates the results, enabling tunable accuracy and efficiency. As the number of directions grows, the distance converges to the Sliced-Wasserstein metric, which itself approximates the full Wasserstein distance. For a small number of directions, computational performance approaches that of persistence statistics, but with improved discriminative power (Nuñez et al., 2024).
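The effect of the number of directions can be observed numerically. The sketch below assumes evenly spaced angles and a plain average of 1D Wasserstein costs, which may differ from the normalization used in (Nuñez et al., 2024); the diagrams and the dense reference value are illustrative.

```python
import math

def sliced_cost(diag_a, diag_b, n_directions):
    """Average 1D Wasserstein cost over evenly spaced projection angles."""
    # Append the diagonal projections of the other diagram to equalize sizes.
    aug_a = list(diag_a) + [((b + d) / 2, (b + d) / 2) for b, d in diag_b]
    aug_b = list(diag_b) + [((b + d) / 2, (b + d) / 2) for b, d in diag_a]
    total = 0.0
    for i in range(n_directions):
        theta = -math.pi / 2 + math.pi * i / n_directions
        c, s = math.cos(theta), math.sin(theta)
        proj_a = sorted(c * x + s * y for x, y in aug_a)
        proj_b = sorted(c * x + s * y for x, y in aug_b)
        total += sum(abs(u - v) for u, v in zip(proj_a, proj_b))
    return total / n_directions

diag_a = [(0.0, 1.0), (0.2, 0.5)]
diag_b = [(0.0, 1.2)]

# A dense angular grid stands in for the continuous (sliced) limit.
reference = sliced_cost(diag_a, diag_b, 4096)
for m in (1, 2, 4, 16, 64, 512):
    value = sliced_cost(diag_a, diag_b, m)
    print(f"{m:4d} directions: {value:.5f}  (gap to reference: {abs(value - reference):.5f})")
```

Because the per-direction cost varies continuously with the angle, the average over an evenly spaced grid is a Riemann sum, and the gap to the dense reference shrinks as the grid is refined.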
5. Applications and Empirical Performance
ETDs have demonstrated effectiveness across several domains:
- Shape Matching and Classification: MDRG- and XPHT-based ETDs substantially outperform scalar topological summaries and histogram-based graph distances in discriminating nontrivial classes in both 2D and 3D shape datasets (Ramamurthi et al., 2023, Turner et al., 2022).
- Scientific Data Analysis: Multi-field ETDs enable discrimination of time-varying phenomena in multi-orbital chemistry simulations, outperforming scalar approaches in localizing critical bonding events with high accuracy (ROC-AUC up to 0.98) (Ramamurthi et al., 2023).
- Persistence Diagram Analysis in ML: Fast projection-based ETDs yield near-classical Wasserstein accuracy in supervised $k$-NN tasks on Outex, SHREC07, and Fashion-MNIST, at computational costs orders of magnitude lower than exact diagram matching. For autoencoder topology, they track nontrivial homological changes missed by persistence statistics (Nuñez et al., 2024).
- Font Image Classification: XPHT-based ETD can robustly disambiguate subtle topological and geometric differences (e.g., serif versus sans-serif letters) in collections of real-world binary images (Turner et al., 2022).
6. Interpretative and Practical Aspects
ETDs provide stability, scalability, and flexibility not accessible to classical topological distances alone:
- Stability: All ETDs reviewed exhibit explicit Lipschitz continuity or tight sensitivity to perturbations in input data, both in algebraic and geometric settings.
- Torsion Sensitivity: Algebraic ETDs (e.g., $d_T$) remain the only constructions to provably account for torsion phenomena, essential in multidimensional persistent homology (Frosini, 2010).
- Computational Control: Projection-based ETDs offer a principled, tunable trade-off between accuracy and efficiency: small projection sets optimize speed, larger sets increasingly approximate Sliced-Wasserstein limits.
- Practical Algorithmics: Efficient implementations benefit from batched floating-point operations and careful angle precomputation. For images and 2D shapes, Morse-theoretic decompositions and boundary-based computations compound these advantages (Turner et al., 2022, Nuñez et al., 2024).
A plausible implication is that ETDs provide a versatile toolbox for applied topological data analysis, capable of adapting to the specific constraints of algebraic complexity, data size, or the underlying scientific application.
7. Limitations and Outlook
ETDs, while versatile, are structurally non-injective: they are pseudodistances, not strict metrics, and can equate distinct objects with coincident topological summaries. For algebraically defined constructions (e.g., $d_T$), no general polynomial-time algorithms for exact computation are known, owing to the inherent complexity of recognizing surjective homomorphism relationships among persistent homology subgroups. For projection-based ETDs and MDRG-ETDs, by contrast, explicit parameter tuning (e.g., number of projections, quantization slabs) is required for optimal performance, and accuracy depends on the granularity of the discretization.
Suggested directions for future research include further efficiency gains, tighter injectivity properties, and the extension of ETDs to higher-dimensional or more exotic data modalities, consolidating their role as instruments for stable, scalable, and meaningful topological comparison across applied science and data analysis domains (Frosini, 2010, Turner et al., 2022, Ramamurthi et al., 2023, Nuñez et al., 2024).