Explaining the Power of Topological Data Analysis in Graph Machine Learning (2401.04250v1)

Published 8 Jan 2024 in cs.LG and cs.SI

Abstract: Topological Data Analysis (TDA) has been praised by researchers for its ability to capture intricate shapes and structures within data. TDA is considered robust in handling noisy and high-dimensional datasets, and its interpretability is believed to promote an intuitive understanding of model behavior. However, claims about the power and usefulness of TDA have only been partially tested in application domains where TDA-based models are compared against other graph machine learning approaches, such as graph neural networks. We meticulously test these claims through a comprehensive set of experiments and assess their merits. Our results affirm TDA's robustness against outliers and its interpretability, in line with proponents' arguments. However, we find that TDA does not significantly enhance the predictive power of existing methods in our experiments, while incurring significant computational costs. We investigate how graph characteristics, such as small diameters and high clustering coefficients, can be exploited to mitigate the computational expense of TDA computations. Our results offer valuable perspectives on integrating TDA into graph machine learning tasks.
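To make the pipeline the abstract refers to concrete, below is a minimal sketch of the common TDA-on-graphs recipe: treat a graph as a metric space via shortest-path distances, build a Vietoris-Rips filtration over it, and extract a persistence diagram that can later be vectorized for a downstream classifier. This is an illustrative sketch, not the paper's exact experimental setup; it assumes the third-party networkx and gudhi libraries are installed, and the choice of the Petersen graph and the shortest-path filtration are illustrative.

```python
# Minimal sketch: persistent homology of a graph via a shortest-path
# Vietoris-Rips filtration. Assumes `pip install networkx gudhi numpy`.
import networkx as nx
import numpy as np
import gudhi

# Toy input standing in for a benchmark graph: the Petersen graph
# (10 nodes, diameter 2, girth 5, so it carries nontrivial 1-cycles).
G = nx.petersen_graph()

# All-pairs shortest-path distances define the metric space. Small
# diameters (one of the graph characteristics the abstract mentions)
# mean the filtration saturates at a small scale.
D = nx.floyd_warshall_numpy(G)

# Vietoris-Rips filtration over the distance matrix, with simplices
# up to dimension 2 so that H0 (components) and H1 (loops) are computed.
rips = gudhi.RipsComplex(distance_matrix=D, max_edge_length=float(np.max(D)))
st = rips.create_simplex_tree(max_dimension=2)

# Persistence pairs: (homology dimension, (birth, death)). These are
# the raw features that vectorizations such as persistence landscapes
# turn into inputs for a standard classifier.
for dim, (birth, death) in st.persistence():
    print(f"H{dim}: birth={birth:g}, death={death:g}")
```

Note that the simplex tree can grow very quickly on dense or small-diameter graphs, since the number of simplices is worst-case exponential in the dimension cap; this blow-up is the kind of computational cost the abstract weighs against TDA's robustness and interpretability benefits.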
