HyPE-GT: where Graph Transformers meet Hyperbolic Positional Encodings (2312.06576v1)

Published 11 Dec 2023 in cs.LG

Abstract: Graph Transformers (GTs) process graph-structured data by computing self-attention over node pairs, but without access to node position information. To address this limitation, we introduce an efficient framework that injects Positional Encodings (PEs) into the Transformer, generating a set of learnable positional encodings in hyperbolic space, a non-Euclidean domain. The framework offers diverse options, built on hyperbolic neural networks or hyperbolic graph convolutional networks, from which the PEs best suited to a specific downstream task can be selected. Additionally, we repurpose these positional encodings to mitigate over-smoothing in deep Graph Neural Networks (GNNs). Comprehensive experiments on molecular benchmark datasets and on co-author and co-purchase networks substantiate the effectiveness of hyperbolic positional encodings in enhancing the performance of deep GNNs.
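To make the abstract concrete, below is a minimal PyTorch sketch of one way learnable hyperbolic positional encodings can be produced and injected into a Transformer: an initial encoding is treated as a tangent vector at the origin of the Poincaré ball, passed through a hyperbolic feed-forward layer in the style of hyperbolic neural networks (Möbius matrix-vector product, Möbius bias, nonlinearity applied in the tangent space), and mapped back to Euclidean space so it can be added to node features before self-attention. This is an illustrative reconstruction under assumed design choices, not the authors' HyPE-GT implementation; the module name `HyperbolicPE`, the fixed curvature `c = 1.0`, and the random inputs are all assumptions.

```python
import torch
import torch.nn as nn

def expmap0(v, c=1.0, eps=1e-6):
    """Exponential map at the origin of a Poincare ball with curvature -c."""
    sqrt_c = c ** 0.5
    n = v.norm(dim=-1, keepdim=True).clamp_min(eps)
    return torch.tanh(sqrt_c * n) * v / (sqrt_c * n)

def logmap0(y, c=1.0, eps=1e-6):
    """Logarithmic map at the origin (inverse of expmap0)."""
    sqrt_c = c ** 0.5
    n = y.norm(dim=-1, keepdim=True).clamp_min(eps)
    # clamp the atanh argument away from 1 for numerical stability
    return torch.atanh((sqrt_c * n).clamp(max=1 - eps)) * y / (sqrt_c * n)

def mobius_add(x, y, c=1.0):
    """Mobius addition on the Poincare ball."""
    xy = (x * y).sum(-1, keepdim=True)
    x2 = (x * x).sum(-1, keepdim=True)
    y2 = (y * y).sum(-1, keepdim=True)
    num = (1 + 2 * c * xy + c * y2) * x + (1 - c * x2) * y
    den = 1 + 2 * c * xy + c ** 2 * x2 * y2
    return num / den.clamp_min(1e-6)

class HyperbolicPE(nn.Module):
    """One hyperbolic feed-forward layer producing learnable PEs (assumed design)."""
    def __init__(self, pe_dim, hidden_dim, c=1.0):
        super().__init__()
        self.c = c
        self.lin = nn.Linear(pe_dim, hidden_dim, bias=False)
        self.bias = nn.Parameter(torch.zeros(hidden_dim))

    def forward(self, pe_init):
        # Treat pe_init as a tangent vector at the origin; the Mobius
        # matrix-vector product exp0(W @ log0(x)) then reduces to exp0(W @ pe_init).
        h = expmap0(self.lin(pe_init), self.c)
        h = mobius_add(h, expmap0(self.bias, self.c), self.c)  # Mobius bias
        h = expmap0(torch.relu(logmap0(h, self.c)), self.c)    # nonlinearity in tangent space
        return logmap0(h, self.c)  # back to Euclidean space for the Transformer

# Usage: augment node features with the learned encodings before self-attention.
num_nodes, pe_dim, hidden = 32, 8, 64
pe = HyperbolicPE(pe_dim, hidden)(torch.randn(num_nodes, pe_dim))
h = torch.randn(num_nodes, hidden) + pe
attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
out, _ = attn(h.unsqueeze(0), h.unsqueeze(0), h.unsqueeze(0))
```

For the abstract's second use, mitigating over-smoothing, one plausible reading is that the same encodings are re-injected into the hidden states at deeper GNN layers as a positional residual; the exact injection scheme is a detail of the paper.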

Authors (2)
  1. Kushal Bose
  2. Swagatam Das
