Multi-scale Wasserstein Shortest-path Graph Kernels for Graph Classification (2206.00979v5)

Published 2 Jun 2022 in cs.LG and cs.AI

Abstract: Graph kernels are conventional methods for computing graph similarity. However, existing R-convolution graph kernels fail to address two challenges: 1) comparing graphs at multiple scales, and 2) accounting for the distributions of substructures when computing the kernel matrix. Both limitations hurt their performance. To address them, we propose a novel graph kernel, the Multi-scale Wasserstein Shortest-Path graph kernel (MWSP), whose core is the multi-scale shortest-path node feature map: each element counts the occurrences of a shortest path around a node, where a shortest path is represented by the concatenation of the labels of all nodes on it. Since the shortest-path node feature map alone compares graphs only at a local scale, we augment it with multiple scales of graph structure, captured by truncated BFS trees of different depths rooted at each node. We use the Wasserstein distance to compute the similarity between the multi-scale shortest-path node feature maps of two graphs, thereby taking the distributions of shortest paths into account. We empirically validate MWSP on various benchmark graph datasets and show that it achieves state-of-the-art performance on most of them.
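
To make the construction concrete, below is a minimal Python sketch of the idea the abstract describes, written against networkx and the POT optimal-transport library. It is an illustration under simplifying assumptions (node labels stored under a "label" attribute, uniform mass on each graph's nodes, a Laplacian-style exp(-lam*W) map with an illustrative parameter lam), not the authors' implementation.

```python
# Minimal sketch of the MWSP idea (illustrative, not the authors' code).
# Requires: networkx, numpy, POT ("pip install pot"); node labels are
# assumed to be stored as G.nodes[v]["label"].
import networkx as nx
import numpy as np
import ot  # Python Optimal Transport


def node_feature(G, v, max_depth):
    """Count label-path occurrences of shortest paths from v inside the
    truncated BFS trees of depths 1..max_depth rooted at v."""
    counts = {}
    for k in range(1, max_depth + 1):
        tree = nx.bfs_tree(G, v, depth_limit=k)
        for u in tree.nodes:
            path = nx.shortest_path(tree, v, u)  # unique root-to-u path in a tree
            key = (k, tuple(G.nodes[p]["label"] for p in path))
            counts[key] = counts.get(key, 0) + 1
    return counts


def feature_matrix(G, max_depth, vocab):
    """Stack per-node count vectors over a shared path vocabulary."""
    X = np.zeros((G.number_of_nodes(), len(vocab)))
    for i, v in enumerate(G.nodes):
        for key, c in node_feature(G, v, max_depth).items():
            X[i, vocab[key]] = c
    return X


def mwsp_kernel(G1, G2, max_depth=2, lam=0.1):
    """Wasserstein distance between the two graphs' node feature maps,
    mapped to a similarity with a Laplacian-style kernel exp(-lam * W)."""
    keys = set()
    for G in (G1, G2):
        for v in G.nodes:
            keys |= node_feature(G, v, max_depth).keys()
    vocab = {key: i for i, key in enumerate(keys)}
    X1 = feature_matrix(G1, max_depth, vocab)
    X2 = feature_matrix(G2, max_depth, vocab)
    M = ot.dist(X1, X2, metric="euclidean")  # ground cost between node features
    a = np.full(len(X1), 1.0 / len(X1))      # uniform mass on G1's nodes
    b = np.full(len(X2), 1.0 / len(X2))      # uniform mass on G2's nodes
    W = ot.emd2(a, b, M)                     # exact optimal-transport distance
    return np.exp(-lam * W)


# Example: two small labeled graphs
G1, G2 = nx.path_graph(4), nx.cycle_graph(4)
for G in (G1, G2):
    nx.set_node_attributes(G, {v: "C" for v in G.nodes}, "label")
print(mwsp_kernel(G1, G2))
```

The exponential map is one common way to turn a distance into a similarity, but the resulting Gram matrix is not guaranteed to be positive semidefinite; in practice, Wasserstein-based kernels are therefore paired with SVM solvers that tolerate indefinite matrices, or the transport problem is approximated with entropic (Sinkhorn) regularization for speed.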
