ATNPA: A Unified View of Oversmoothing Alleviation in Graph Neural Networks (2405.01663v1)

Published 2 May 2024 in cs.LG and cs.AI

Abstract: Oversmoothing is a commonly observed challenge in graph neural network (GNN) learning: as layers increase, the embedding features learned by GNNs quickly become similar and indistinguishable, making them incapable of differentiating network proximity. A GNN with a shallow architecture can only learn short-range relations or localized structural information, limiting its power to learn long-range connections, as evidenced by the inferior performance of shallow GNNs on heterophilous graphs. Tackling oversmoothing is therefore crucial to harnessing deep-layer architectures for GNNs. To date, many methods have been proposed to alleviate oversmoothing, but the vast differences in their design principles, combined with the complications of graph data, make it difficult to understand, or even compare, how they tackle oversmoothing. In this paper, we propose ATNPA, a unified view with five key steps: Augmentation, Transformation, Normalization, Propagation, and Aggregation, to summarize GNN oversmoothing alleviation approaches. We first outline three themes for tackling oversmoothing, then separate all methods into six categories, followed by detailed reviews of representative methods, including their relation to ATNPA and a discussion of their niches, strengths, and weaknesses. The review not only provides an in-depth understanding of existing methods in the field but also lays out a clear road map for future study.
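To make the phenomenon concrete, here is a minimal, self-contained sketch (not from the paper; the toy graph and all names are illustrative) of how oversmoothing can be observed: repeatedly applying GCN-style propagation to node features drives their Dirichlet energy, a standard measure of how much features vary across edges, toward zero, at which point node embeddings become indistinguishable.

```python
import numpy as np

def gcn_propagation_matrix(A):
    """GCN-style operator D^{-1/2} (A + I) D^{-1/2} (symmetric normalization
    with self-loops); repeated application is propagation-only message passing."""
    A_hat = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    return A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

def dirichlet_energy(X, P):
    """E(X) = trace(X^T (I - P) X), the Dirichlet energy w.r.t. the normalized
    graph Laplacian; it decays toward 0 as the embeddings oversmooth."""
    L = np.eye(P.shape[0]) - P
    return float(np.trace(X.T @ L @ X))

# Toy graph: two triangles {0,1,2} and {3,4,5} joined by the bridge edge (2,3).
A = np.zeros((6, 6))
for i, j in [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]:
    A[i, j] = A[j, i] = 1.0

P = gcn_propagation_matrix(A)
X = np.random.default_rng(0).normal(size=(6, 4))  # random node features

# Energy shrinks as more propagation layers are stacked: oversmoothing.
for layer in range(0, 21, 5):
    X_layer = np.linalg.matrix_power(P, layer) @ X
    print(f"layers applied = {layer:2d}  "
          f"Dirichlet energy = {dirichlet_energy(X_layer, P):.6f}")
```

The alleviation methods the survey organizes can be read as different ways of preventing this energy (or a related separation measure) from collapsing as depth grows.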
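The five ATNPA steps can also be read as slots in a generic layer template. The sketch below is a schematic instantiation of that reading, not code from the paper (the function signature and parameter names are hypothetical); each slot is filled with one representative choice from the literature: DropEdge-style augmentation, a linear transformation, PairNorm-style normalization, GCN propagation, and residual aggregation.

```python
import numpy as np

def atnpa_layer(X, A, W, alpha=0.1, drop_p=0.1, rng=None):
    """One schematic layer with the five ATNPA slots filled by example choices.
    Hypothetical signature; W must be d x d so the residual shapes match."""
    rng = np.random.default_rng() if rng is None else rng
    n = A.shape[0]

    # (A)ugmentation: DropEdge-style random edge removal, kept symmetric.
    keep = np.triu(rng.random((n, n)) > drop_p, k=1)
    A_aug = A * (keep | keep.T)

    # (T)ransformation: linear feature map.
    H = X @ W

    # (N)ormalization: PairNorm-like -- center across nodes, then give each
    # row unit norm (the "scale individually" variant).
    H = H - H.mean(axis=0, keepdims=True)
    H = H / (np.linalg.norm(H, axis=1, keepdims=True) + 1e-8)

    # (P)ropagation: GCN-style symmetric normalized adjacency with self-loops.
    A_hat = A_aug + np.eye(n)
    d_inv_sqrt = 1.0 / np.sqrt(A_hat.sum(axis=1))
    H = (A_hat * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]) @ H

    # (A)ggregation: residual mix with the layer input (initial-connection style).
    return (1.0 - alpha) * H + alpha * X
```

Under this reading, each family of alleviation methods changes one or more slots: e.g., DropEdge acts on augmentation, PairNorm on normalization, and APPNP/GCNII-style residuals on aggregation.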

Authors (2)
  1. Yufei Jin (5 papers)
  2. Xingquan Zhu (36 papers)

