
Analysing Multiscale Clusterings with Persistent Homology (2305.04281v5)

Published 7 May 2023 in math.AT and cs.LG

Abstract: In data clustering, it is often desirable to find not just a single partition into clusters but a sequence of partitions that describes the data at different scales (or levels of coarseness). A natural problem then is to analyse and compare the (not necessarily hierarchical) sequences of partitions that underpin such multiscale descriptions. Here, we use tools from topological data analysis and introduce the Multiscale Clustering Filtration (MCF), a well-defined and stable filtration of abstract simplicial complexes that encodes arbitrary cluster assignments in a sequence of partitions across scales of increasing coarseness. We show that the zero-dimensional persistent homology of the MCF measures the degree of hierarchy of this sequence, and the higher-dimensional persistent homology tracks the emergence and resolution of conflicts between cluster assignments across the sequence of partitions. To broaden the theoretical foundations of the MCF, we provide an equivalent construction via a nerve complex filtration, and we show that, in the hierarchical case, the MCF reduces to a Vietoris-Rips filtration of an ultrametric space. Using synthetic data, we then illustrate how the persistence diagram of the MCF provides a feature map that can serve to characterise and classify multiscale clusterings.
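To make the idea of the filtration concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a construction that the abstract does not spell out: each cluster of the partition at scale t enters the filtration as a solid simplex (the cluster together with all its subsets) at filtration value t, and a simplex keeps the earliest scale at which it appears. The function name `mcf_simplices` and the parameter `max_dim` are hypothetical names chosen for illustration.

```python
from itertools import combinations


def mcf_simplices(partitions, scales, max_dim=2):
    """Map each simplex (a sorted tuple of points) to its birth scale.

    Assumption for illustration: every cluster of the partition at
    scale t contributes all of its subsets with up to max_dim + 1
    points at filtration value t. `scales` must be increasing, so the
    first time a simplex is seen is its earliest birth.
    """
    births = {}
    for t, partition in zip(scales, partitions):
        for cluster in partition:
            members = sorted(cluster)
            # k is the number of vertices, so the simplex has dimension k - 1.
            for k in range(1, max_dim + 2):
                for simplex in combinations(members, k):
                    births.setdefault(simplex, t)
    return births


# A non-hierarchical sequence on three points: the pairs {0,1}, {1,2}
# and {0,2} are each clustered together at some scale, but no cluster
# ever contains all three points.
partitions = [
    [{0, 1}, {2}],
    [{0}, {1, 2}],
    [{0, 2}, {1}],
]
births = mcf_simplices(partitions, scales=[1, 2, 3])

for simplex, t in sorted(births.items(), key=lambda item: (item[1], item[0])):
    print(simplex, "born at scale", t)
```

In this toy sequence the three edges are born at scales 1, 2 and 3, while the triangle (0, 1, 2) never enters the filtration. Under the stated assumption, this leaves an unfilled one-dimensional cycle, which is the kind of conflict between cluster assignments that the abstract attributes to the higher-dimensional persistent homology. The resulting simplex-to-scale map could be passed to a persistent homology library to obtain a persistence diagram.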
