Differentiable Cluster Graph Neural Network (2405.16185v1)
Abstract: Graph Neural Networks often struggle to propagate information over long ranges and to handle heterophilous neighborhoods. We address both challenges with a unified framework that incorporates a clustering inductive bias into the message passing mechanism by means of additional cluster-nodes. Central to our approach is an implicit clustering objective function based on optimal transport. To enable end-to-end learning of the GNN, the algorithm that solves this implicit objective must be differentiable. We therefore adopt an entropy-regularized objective function and propose an iterative optimization process that alternates between solving for the cluster assignments and updating the node and cluster-node embeddings. Notably, the closed-form optimization steps we derive are themselves simple message passing steps on a bipartite graph of nodes and cluster-nodes. Our clustering-based approach effectively captures both local and global information, as demonstrated by extensive experiments on heterophilous and homophilous datasets.
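The alternating scheme described in the abstract can be illustrated with a small sketch. This is not the paper's derived update: it assumes uniform marginals, a squared-Euclidean cost, Sinkhorn scaling for the entropy-regularized assignment step, and an assignment-weighted mean for the cluster update, and it leaves out the node-embedding update and all learned parameters. The names `sinkhorn_assign`, `squared_distance`, and `alternate` are hypothetical.

```python
import math


def sinkhorn_assign(cost, eps=0.5, n_iters=50):
    """Soft node-to-cluster assignment via entropy-regularized optimal transport.

    cost is an n x k list of lists. Uniform marginals are an assumption of this
    sketch: each node carries mass 1/n and each cluster receives mass 1/k.
    """
    n, k = len(cost), len(cost[0])
    # Gibbs kernel. Subtracting each row's minimum only rescales that row,
    # which the row scaling u absorbs, and it reduces the risk of underflow.
    kernel = [[math.exp(-(c - min(row)) / eps) for c in row] for row in cost]
    u, v = [1.0] * n, [1.0] * k
    for _ in range(n_iters):
        # Rescale rows to match the node marginal, then columns to match
        # the cluster marginal.
        u = [(1.0 / n) / sum(kernel[i][j] * v[j] for j in range(k)) for i in range(n)]
        v = [(1.0 / k) / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(k)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(k)] for i in range(n)]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alternate(points, centers, eps=0.5, n_rounds=10):
    """Alternate between solving for assignments and updating cluster embeddings."""
    n, k, dim = len(points), len(centers), len(points[0])
    plan = None
    for _ in range(n_rounds):
        # Step 1: solve for the soft assignment plan given the current centers.
        cost = [[squared_distance(p, c) for c in centers] for p in points]
        plan = sinkhorn_assign(cost, eps)
        # Step 2 (assumed update rule): move each cluster embedding to the
        # assignment-weighted mean of the node embeddings.
        centers = [
            [
                sum(plan[i][j] * points[i][d] for i in range(n))
                / sum(plan[i][j] for i in range(n))
                for d in range(dim)
            ]
            for j in range(k)
        ]
    return plan, centers
```

Calling `alternate(points, centers)` with node embeddings and initial cluster embeddings given as lists of coordinate lists returns the last transport plan together with the updated cluster embeddings. Every operation in the loop is a smooth function of its inputs, which is the property that lets a solver of this kind sit inside an end-to-end trained network. A smaller `eps` gives sharper assignments but makes the kernel more prone to underflow.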