SkipNode: On Alleviating Performance Degradation for Deep Graph Convolutional Networks

Published 22 Dec 2021 in cs.LG (arXiv:2112.11628v4)

Abstract: Graph Convolutional Networks (GCNs) suffer from performance degradation as models grow deeper, yet earlier works attributed this degradation solely to over-smoothing. In this paper, we conduct theoretical and experimental analyses to uncover the fundamental causes of performance degradation in deep GCNs: over-smoothing and gradient vanishing reinforce each other, causing performance to deteriorate more quickly as depth increases. Moreover, existing anti-over-smoothing methods all perform full convolutions at every layer up to the model depth, so they cannot effectively resist the exponential convergence of over-smoothing as the model deepens. In this work, we propose a simple yet effective plug-and-play module, SkipNode, to overcome the performance degradation of deep GCNs. At each convolutional layer, it samples a subset of graph nodes that skip the convolution operation. In this way, both over-smoothing and gradient vanishing are effectively suppressed, since (1) not all nodes' features propagate through the full stack of layers and (2) gradients can be passed directly back through the "skipped" nodes. We provide both theoretical analysis and empirical evaluation to demonstrate the efficacy of SkipNode and its superiority over state-of-the-art baselines.
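A minimal sketch of the sampling-based skip described in the abstract, assuming a PyTorch-style dense GCN layer. The class name SkipNodeConv, the skip_rate argument, and the dense normalized adjacency `adj` are illustrative assumptions for exposition, not the authors' released implementation.

```python
import torch
import torch.nn as nn


class SkipNodeConv(nn.Module):
    """GCN-style layer in which a random subset of nodes skips the convolution."""

    def __init__(self, dim: int, skip_rate: float = 0.5):
        super().__init__()
        # Equal input/output dimensions keep the identity (skip) path valid.
        self.linear = nn.Linear(dim, dim)
        self.skip_rate = skip_rate

    def forward(self, x: torch.Tensor, adj: torch.Tensor) -> torch.Tensor:
        # Standard GCN propagation: `adj` is a normalized adjacency matrix of
        # shape [N, N]; `x` holds node features of shape [N, dim].
        conv_out = torch.relu(adj @ self.linear(x))

        if self.training and self.skip_rate > 0:
            # Sampled ("skipped") nodes keep their input features unchanged, so
            # their representations are not smoothed further and gradients reach
            # them through an identity path rather than the convolution.
            skip = (torch.rand(x.size(0), device=x.device) < self.skip_rate)
            skip = skip.float().unsqueeze(1)          # shape [N, 1]
            return skip * x + (1.0 - skip) * conv_out

        # At inference, every node receives the full convolution.
        return conv_out


# Toy usage: 100 nodes with 64-dimensional features.
layer = SkipNodeConv(dim=64, skip_rate=0.5)
x = torch.randn(100, 64)
adj = torch.eye(100)          # stand-in for a normalized adjacency matrix
out = layer(x, adj)           # same shape as x: [100, 64]
```

Stacked over many layers, this means that in expectation each node's features pass through only a fraction of the convolutions, which is the mechanism the abstract credits for suppressing both over-smoothing and gradient vanishing.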
