
A Normalized Bottleneck Distance on Persistence Diagrams and Homology Preservation under Dimension Reduction (2306.06727v2)

Published 11 Jun 2023 in cs.CG, cs.LG, and math.AT

Abstract: Persistence diagrams (PDs) are used as signatures of point cloud data. Two clouds of points can be compared using the bottleneck distance d_B between their PDs. A potential drawback of this pipeline is that point clouds sampled from topologically similar manifolds can have arbitrarily large d_B when there is a large scaling between them. This situation is typical in dimension reduction frameworks. We define, and study properties of, a new scale-invariant distance between PDs termed normalized bottleneck distance, d_N. In defining d_N, we develop a broader framework called metric decomposition for comparing finite metric spaces of equal cardinality with a bijection. We utilize metric decomposition to prove a stability result for d_N by deriving an explicit bound on the distortion of the bijective map. We then study two popular dimension reduction techniques, Johnson-Lindenstrauss (JL) projections and metric multidimensional scaling (mMDS), and a third class of general biLipschitz mappings. We provide new bounds on how well these dimension reduction techniques preserve homology with respect to d_N. For a JL map f that transforms input X to f(X), we show that d_N(dgm(X), dgm(f(X))) < \epsilon, where dgm(X) is the Vietoris-Rips PD of X, and pairwise distances are preserved by f up to the tolerance 0 < \epsilon < 1. For mMDS, we present new bounds for d_B and d_N between PDs of X and its projection in terms of the eigenvalues of the covariance matrix. For k-biLipschitz maps, we show that d_N is bounded by the product of (k^2-1)/k and the ratio of diameters of X and f(X). Finally, we use computational experiments to demonstrate the increased effectiveness of using the normalized bottleneck distance for clustering sets of point clouds sampled from different shapes.
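
The scaling issue described in the abstract can be illustrated with a small sketch. It rests on several assumptions that are not stated in the abstract: d_N is taken here to be the bottleneck distance between diagrams rescaled by the diameter of their point clouds, only the finite 0-dimensional Vietoris-Rips pairs are computed (via minimum spanning tree edge lengths), and the bottleneck distance is found by brute force, which is feasible only for very small diagrams. All function names are hypothetical and are not taken from the paper.

```python
import itertools
import math


def h0_diagram(points):
    """Finite H0 Vietoris-Rips pairs: birth 0, death = MST edge length (Kruskal)."""
    n = len(points)
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i in range(n) for j in range(i + 1, n)
    )
    parent = list(range(n))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    diagram = []
    for w, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components, so one class dies at w
            parent[ri] = rj
            diagram.append((0.0, w))
    return diagram


def diameter(points):
    return max(math.dist(p, q) for p in points for q in points)


def bottleneck_bruteforce(A, B):
    """Bottleneck distance for tiny diagrams by trying every matching.

    Each diagram is padded with diagonal slots (None) so that any point
    may be matched to the diagonal at L-infinity cost (death - birth) / 2.
    """
    left = list(A) + [None] * len(B)
    right = list(B) + [None] * len(A)

    def cost(p, q):
        if p is None and q is None:
            return 0.0
        if p is None:
            return (q[1] - q[0]) / 2
        if q is None:
            return (p[1] - p[0]) / 2
        return max(abs(p[0] - q[0]), abs(p[1] - q[1]))

    best = math.inf
    for perm in itertools.permutations(range(len(right))):
        worst = max((cost(left[i], right[perm[i]]) for i in range(len(left))),
                    default=0.0)
        best = min(best, worst)
    return best


def normalized_bottleneck(X, Y):
    """ASSUMED form of d_N: d_B between diameter-normalized diagrams."""
    dx, dy = diameter(X), diameter(Y)
    A = [(b / dx, d / dx) for b, d in h0_diagram(X)]
    B = [(b / dy, d / dy) for b, d in h0_diagram(Y)]
    return bottleneck_bruteforce(A, B)


# A unit square and the same square scaled by a factor of 10.
X = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
Y = [(10.0 * x, 10.0 * y) for x, y in X]

d_b = bottleneck_bruteforce(h0_diagram(X), h0_diagram(Y))
d_n = normalized_bottleneck(X, Y)
print("d_B:", d_b)
print("d_N:", d_n)
```

In this toy case the two clouds have the same shape, yet d_B grows with the scaling factor while the normalized value stays at zero up to floating-point error. For realistic diagram sizes, a dedicated persistent homology library would replace both the brute-force matching and the H0-only computation.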
