
Persistent Homology for High-dimensional Data Based on Spectral Methods (2311.03087v3)

Published 6 Nov 2023 in cs.LG and math.AT

Abstract: Persistent homology is a popular computational tool for analyzing the topology of point clouds, such as the presence of loops or voids. However, many real-world datasets with low intrinsic dimensionality reside in an ambient space of much higher dimensionality. We show that in this case traditional persistent homology becomes very sensitive to noise and fails to detect the correct topology. The same holds true for existing refinements of persistent homology. As a remedy, we find that spectral distances on the k-nearest-neighbor graph of the data, such as diffusion distance and effective resistance, make it possible to detect the correct topology even in the presence of high-dimensional noise. Moreover, we derive a novel closed-form formula for effective resistance and describe its relation to diffusion distances. Finally, we apply these methods to high-dimensional single-cell RNA-sequencing data and show that spectral distances allow robust detection of cell cycle loops.
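
To make the idea of a spectral distance on the k-nearest-neighbor graph concrete, here is a minimal sketch in Python. It uses the standard textbook expression for effective resistance via the pseudoinverse of the graph Laplacian, not the closed-form formula derived in the paper. The function names `knn_adjacency` and `effective_resistance`, the unweighted symmetric graph, and the toy noisy circle are all assumptions made for illustration. The sketch also assumes that the graph is connected.

```python
import numpy as np


def knn_adjacency(X, k):
    """Symmetric, unweighted kNN adjacency matrix (brute force, illustrative)."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances between all pairs of points.
    D2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(D2, np.inf)  # a point is not its own neighbor
    idx = np.argsort(D2, axis=1)[:, :k]
    n = X.shape[0]
    A = np.zeros((n, n))
    A[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    # Keep an edge if either endpoint lists the other as a neighbor.
    return np.maximum(A, A.T)


def effective_resistance(A):
    """Pairwise effective resistance of a connected graph with adjacency A.

    Standard formula: R_ij = Lp_ii + Lp_jj - 2 * Lp_ij, where Lp is the
    Moore-Penrose pseudoinverse of the graph Laplacian L = D - A.
    """
    A = np.asarray(A, dtype=float)
    L = np.diag(A.sum(axis=1)) - A
    Lp = np.linalg.pinv(L)
    d = np.diag(Lp)
    R = d[:, None] + d[None, :] - 2.0 * Lp
    np.fill_diagonal(R, 0.0)
    return np.maximum(R, 0.0)  # clip tiny negative round-off


# Toy data (assumption): a circle embedded in 50 dimensions with Gaussian noise.
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=200)
X = np.zeros((200, 50))
X[:, 0], X[:, 1] = np.cos(theta), np.sin(theta)
X += 0.05 * rng.standard_normal(X.shape)

R = effective_resistance(knn_adjacency(X, k=15))
```

The resulting matrix `R` could then be handed to a Vietoris-Rips persistent homology implementation that accepts a precomputed distance matrix, in place of the Euclidean distances.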
