Differentiable Mapper For Topological Optimization Of Data Representation (2402.12854v1)
Abstract: Unsupervised data representation and visualization using tools from topology is an active and growing field of Topological Data Analysis (TDA) and data science. Its most prominent line of work is based on the so-called Mapper graph, a combinatorial graph whose topological structures (connected components, branches, loops) correspond to those of the data itself. While highly generic and applicable, its use has so far been hampered by the manual tuning of its many parameters. Among these, a crucial one is the so-called filter: a continuous function whose variations on the data set are the main ingredient for both building the Mapper representation and assessing the presence and sizes of its topological structures. However, while a few parameter tuning methods have already been investigated for the other Mapper parameters (i.e., resolution, gain, clustering), there is currently no method for tuning the filter itself. In this work, we build on a recently proposed optimization framework incorporating topology to provide the first filter optimization scheme for Mapper graphs. To achieve this, we propose a relaxed and more general version of the Mapper graph, whose convergence properties are investigated. Finally, we demonstrate the usefulness of our approach by optimizing Mapper graph representations on several datasets and showcasing the superiority of the optimized representation over arbitrary ones.
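To make the roles of the filter, resolution, gain, and clustering concrete, the following is a minimal sketch of a classical (non-differentiable) Mapper graph for a scalar filter. It is not the relaxed construction proposed in the paper. The function names (`interval_cover`, `cluster`, `mapper_graph`, `components_and_loops`), the use of epsilon-neighborhood connected components as the clustering step, and the toy circle data are all assumptions made for illustration.

```python
import math
from itertools import combinations


def interval_cover(lo, hi, resolution, gain):
    """Cover [lo, hi] with `resolution` equal-length intervals.

    Consecutive intervals overlap by a fraction `gain` of their length
    (assumed convention: 0 < gain < 1/2, so only neighbors overlap).
    """
    length = (hi - lo) / (1 + (resolution - 1) * (1 - gain))
    step = length * (1 - gain)
    cover = [(lo + i * step, lo + i * step + length) for i in range(resolution)]
    cover[-1] = (cover[-1][0], hi)  # clamp to avoid losing the max by rounding
    return cover


def cluster(indices, points, eps):
    """Clusters = connected components of the eps-neighborhood graph."""
    remaining, clusters = set(indices), []
    while remaining:
        seed = remaining.pop()
        comp, stack = {seed}, [seed]
        while stack:
            p = stack.pop()
            near = {q for q in remaining
                    if math.dist(points[p], points[q]) <= eps}
            remaining -= near
            comp |= near
            stack.extend(near)
        clusters.append(frozenset(comp))
    return clusters


def mapper_graph(points, filter_fn, resolution, gain, eps):
    """Nodes are clusters of filter preimages; edges join clusters sharing points."""
    values = [filter_fn(p) for p in points]
    nodes = []
    for a, b in interval_cover(min(values), max(values), resolution, gain):
        preimage = [i for i, v in enumerate(values) if a <= v <= b]
        nodes.extend(cluster(preimage, points, eps))
    edges = {(i, j) for i, j in combinations(range(len(nodes)), 2)
             if nodes[i] & nodes[j]}
    return nodes, edges


def components_and_loops(nodes, edges):
    """Connected components and independent loops (E - V + C) of the graph."""
    parent = list(range(len(nodes)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in edges:
        parent[find(i)] = find(j)
    comps = len({find(i) for i in range(len(nodes))})
    return comps, len(edges) - len(nodes) + comps


# Toy data (assumption): 60 points on the unit circle, filter = x-coordinate.
circle = [(math.cos(2 * math.pi * k / 60), math.sin(2 * math.pi * k / 60))
          for k in range(60)]
nodes, edges = mapper_graph(circle, lambda p: p[0],
                            resolution=4, gain=0.3, eps=0.25)
comps, loops = components_and_loops(nodes, edges)
print(len(nodes), len(edges), comps, loops)
```

With these settings the graph has one connected component and one loop, matching the topology of the circle. Every step depends on hand-picked values of `resolution`, `gain`, `eps`, and the filter itself, which is the tuning burden the abstract refers to.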
Authors: Ziyad Oulhaj, Mathieu Carrière, Bertrand Michel