
Understanding Virtual Nodes: Oversmoothing, Oversquashing, and Node Heterogeneity (2405.13526v1)

Published 22 May 2024 in cs.LG

Abstract: Message passing neural networks (MPNNs) have been shown to have limitations in terms of expressivity and modeling long-range interactions. Augmenting MPNNs with a virtual node (VN) removes the locality constraint of the layer aggregation and has been found to improve performance on a range of benchmarks. We provide a comprehensive theoretical analysis of the role and benefits of VNs through the lenses of oversmoothing, oversquashing, and sensitivity analysis. First, in contrast to prior belief, we find that VNs typically avoid replicating anti-smoothing approaches to maintain expressive power. Second, we characterize precisely how the improvement afforded by VNs to the mixing abilities of the network, and hence to mitigating oversquashing, depends on the underlying topology. Finally, we highlight that, unlike Graph Transformers (GTs), classical instantiations of the VN are often constrained to assign uniform importance to different nodes. Consequently, we propose a variant of the VN with the same computational complexity that can have different sensitivity to nodes based on the graph structure. We show that this is an extremely effective and computationally efficient baseline on graph-level tasks.
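To make the virtual-node mechanism concrete, below is a minimal sketch of one message-passing layer augmented with a VN, written against a generic PyTorch interface. The class name, layer shapes, and the sum/mean aggregations are illustrative assumptions rather than the paper's implementation; in particular, the uniform mean pooling in the VN update reflects the "uniform importance" behavior of classical VNs that the paper's proposed variant replaces with structure-dependent weighting.

```python
import torch
import torch.nn as nn


class MPNNWithVirtualNode(nn.Module):
    """Sketch of one message-passing layer augmented with a virtual node (VN).

    The VN is connected to every node: at each layer it broadcasts its state
    to all nodes and then re-aggregates their updated states, removing the
    locality constraint of standard neighbourhood aggregation.
    """

    def __init__(self, dim: int):
        super().__init__()
        self.msg = nn.Linear(dim, dim)         # transforms neighbour messages
        self.upd = nn.Linear(2 * dim, dim)     # combines node state with aggregated messages
        self.vn_upd = nn.Linear(2 * dim, dim)  # updates the virtual-node state

    def forward(self, x, edge_index, vn):
        # x: (num_nodes, dim) node states; vn: (dim,) virtual-node state
        # edge_index: (2, num_edges) with rows (source, target)
        src, dst = edge_index

        # 1) Broadcast the VN state to every node before local aggregation.
        x = x + vn.unsqueeze(0)

        # 2) Standard local message passing: sum messages from neighbours.
        agg = torch.zeros_like(x)
        agg.index_add_(0, dst, self.msg(x[src]))
        x = torch.relu(self.upd(torch.cat([x, agg], dim=-1)))

        # 3) VN update: pool all node states. The uniform mean gives every
        #    node the same importance (classical VN); a structure-aware
        #    variant would reweight this pooling per node.
        pooled = x.mean(dim=0)
        vn = torch.relu(self.vn_upd(torch.cat([vn, pooled], dim=-1)))
        return x, vn
```

Because the VN exchanges a single pooled state with all nodes, the extra cost per layer is linear in the number of nodes, which is why a reweighted variant can keep the same computational complexity.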
