On a linear fused Gromov-Wasserstein distance for graph structured data (2203.04711v1)
Abstract: We present a framework for embedding graph-structured data into a vector space that incorporates both the node features and the topology of a graph through the optimal transport (OT) problem. We then propose a novel distance between two graphs, named linearFGW, defined as the Euclidean distance between their embeddings. The advantages of the proposed distance are twofold: 1) it accounts for both the node features and the structure of graphs when measuring the similarity between graphs in a kernel-based framework, and 2) it can compute a kernel matrix much faster than pairwise OT-based distances, particularly fused Gromov-Wasserstein, making it possible to handle large-scale data sets. After discussing the theoretical properties of linearFGW, we present experimental results on classification and clustering tasks that show its effectiveness.
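The speed argument in the abstract can be illustrated with a minimal sketch. It assumes each graph has already been mapped to a fixed-length vector; the embedding step itself is not reproduced here, and the `embeddings` array below is a random placeholder rather than real linearFGW output. The helper names `pairwise_euclidean` and `gaussian_kernel`, the Gaussian kernel form, and the value of `gamma` are illustrative assumptions, not taken from the paper. The point is only that, once N embeddings exist, all N×N distances follow from plain linear algebra instead of N² separate OT solves.

```python
import numpy as np


def pairwise_euclidean(embeddings):
    """Return the (N, N) matrix of Euclidean distances between the rows."""
    sq_norms = np.sum(embeddings ** 2, axis=1)
    # ||x - y||^2 = ||x||^2 + ||y||^2 - 2 <x, y>
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (embeddings @ embeddings.T)
    # Clip tiny negative values caused by floating-point round-off.
    np.maximum(sq_dists, 0.0, out=sq_dists)
    # The distance from a point to itself is exactly zero.
    np.fill_diagonal(sq_dists, 0.0)
    return np.sqrt(sq_dists)


def gaussian_kernel(dists, gamma):
    """Gaussian kernel on a distance matrix (an assumed, common kernel choice)."""
    return np.exp(-gamma * dists ** 2)


# Placeholder data: 5 "graphs", each represented by a hypothetical 8-dimensional embedding.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(5, 8))

D = pairwise_euclidean(embeddings)   # distance between every pair of embeddings
K = gaussian_kernel(D, gamma=0.1)    # kernel matrix usable by, e.g., an SVM
```

Because the distances here are Euclidean, the resulting Gaussian kernel matrix is positive semidefinite, which is convenient for standard kernel methods.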