The Weisfeiler-Lehman Distance: Reinterpretation and Connection with GNNs (2302.00713v3)
Abstract: In this paper, we present a novel interpretation of the so-called Weisfeiler-Lehman (WL) distance, introduced by Chen et al. (2022), using concepts from stochastic processes. The WL distance compares graphs with node features; it has the same discriminative power as the classic Weisfeiler-Lehman graph isomorphism test and deep connections to the Gromov-Wasserstein distance. This new interpretation connects the WL distance to the literature on distances for stochastic processes, making the distance more accessible and intuitive. We further explore the connections between the WL distance and certain Message Passing Neural Networks, and discuss the implications of the WL distance for the Lipschitz property and universal approximation results of these networks.
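Since the abstract states that the WL distance has the same discriminative power as the classic Weisfeiler-Lehman graph isomorphism test, the following is a minimal sketch of that test (1-WL color refinement). The adjacency-list encoding and function names are illustrative assumptions, not taken from the paper:

```python
# Minimal sketch of the 1-dimensional Weisfeiler-Lehman (color refinement)
# isomorphism test. Graphs are adjacency lists {node: [neighbors]};
# `labels` maps each node to an initial feature/color.

def wl_refine(graph, labels, iterations=3):
    """Iteratively re-color each node by hashing its current color
    together with the sorted multiset of its neighbors' colors."""
    colors = dict(labels)
    for _ in range(iterations):
        colors = {
            v: hash((colors[v], tuple(sorted(colors[u] for u in graph[v]))))
            for v in graph
        }
    return colors

def wl_test(g1, l1, g2, l2, iterations=3):
    """Return True iff the two graphs are 1-WL-indistinguishable,
    i.e. their final color multisets coincide. This is a necessary,
    but not sufficient, condition for graph isomorphism."""
    c1 = wl_refine(g1, l1, iterations)
    c2 = wl_refine(g2, l2, iterations)
    return sorted(c1.values()) == sorted(c2.values())
```

The WL distance discussed in the paper refines this boolean test into a metric on graphs with node features; the sketch only illustrates the refinement scheme that both share.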
- Ambrosio, L., Gigli, N., and Savaré, G. Gradient flows: in metric spaces and in the space of probability measures. Springer Science & Business Media, 2005.
- Azizian, W. et al. Expressive power of invariant and equivariant graph neural networks. In International Conference on Learning Representations, 2020.
- Causal transport in discrete time and applications. SIAM Journal on Optimization, 27(4):2528–2562, 2017.
- Adapted Wasserstein distances and stability in mathematical finance. Finance and Stochastics, 24(3):601–632, 2020.
- Distances for Markov chains, and their differentiation. arXiv preprint arXiv:2302.08621, 2023.
- Burago, D., Burago, Y., and Ivanov, S. A course in metric geometry, volume 33. American Mathematical Society, 2001.
- Chen, S., Lim, S., Mémoli, F., Wan, Z., and Wang, Y. Weisfeiler-Lehman meets Gromov-Wasserstein. In International Conference on Machine Learning (ICML), pp. 3371–3416. PMLR, 2022.
- Chuang, C.-Y. and Jegelka, S. Tree mover’s distance: Bridging graph metrics and stability of graph neural networks. arXiv preprint arXiv:2210.01906, 2022.
- Durrett, R. Probability: theory and examples, volume 49. Cambridge University Press, 2019.
- Computational methods for adapted optimal transport. arXiv preprint arXiv:2203.05005, 2022.
- Incorporating statistical model error into the calculation of acceptability prices of contingent claims. Mathematical Programming, 174(1):499–524, 2019.
- SPATE-GAN: Improved generative modeling of dynamic spatio-temporal patterns with an autoregressive embedding loss. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 36, pp. 4523–4531, 2022.
- Kolmogorov, A. N. Foundations of the theory of probability: Second English Edition. Courier Dover Publications, 2018.
- Lassalle, R. Causal transport plans and their Monge–Kantorovich problems. Stochastic Analysis and Applications, 36(3):452–484, 2018.
- Weisfeiler, B. and Leman, A. A reduction of a graph to a canonical form and an algebra arising during this reduction. Nauchno-Technicheskaya Informatsiya, 2(9):12–16, 1968.
- Levin, D. A. and Peres, Y. Markov chains and mixing times, volume 107. American Mathematical Society, 2017.
- Mémoli, F. Gromov-Wasserstein distances and the metric approach to object matching. Foundations of computational mathematics, 11(4):417–487, 2011.
- The nested sinkhorn divergence to learn the nested distance. Computational Management Science, 19(2):269–293, 2022.
- Togninalli, M., Ghisu, E., Llinares-López, F., Rieck, B., and Borgwardt, K. Wasserstein Weisfeiler-Lehman graph kernels. Advances in Neural Information Processing Systems, 32:6439–6449, 2019.
- Capturing graphs with hypo-elliptic diffusions. 36th Conference on Neural Information Processing Systems (NeurIPS 2022), 2022.
- Weaver, N. Order completeness in Lipschitz algebras. Journal of Functional Analysis, 130(1):118–130, 1995.
- Xu, K., Hu, W., Leskovec, J., and Jegelka, S. How powerful are graph neural networks? In International Conference on Learning Representations, 2018.
- Conditional COT-GAN for video prediction with kernel smoothing. Workshop on Robustness in Sequence Modeling, 36th Conference on Neural Information Processing Systems, 2022.
- COT-GAN: Generating sequential data via causal optimal transport. Advances in Neural Information Processing Systems, 33:8798–8809, 2020.