SkipNode: On Alleviating Performance Degradation for Deep Graph Convolutional Networks
Abstract: Graph Convolutional Networks (GCNs) suffer from performance degradation as they go deeper, a problem that prior work has attributed solely to over-smoothing. In this paper, we conduct theoretical and experimental analyses to explore the fundamental causes of this degradation and find that over-smoothing and gradient vanishing mutually reinforce each other, causing performance to deteriorate more quickly in deep GCNs. Moreover, existing anti-over-smoothing methods all perform full convolutions up to the model depth and therefore cannot resist the exponential convergence of over-smoothing as depth increases. In this work, we propose SkipNode, a simple yet effective plug-and-play module that overcomes the performance degradation of deep GCNs. At each convolutional layer, it samples a subset of graph nodes that skip the convolution operation. In this way, both over-smoothing and gradient vanishing are effectively suppressed, since (1) not all node features propagate through the full stack of layers and (2) gradients can be passed directly back through the "skipped" nodes. We provide both theoretical analysis and empirical evaluation to demonstrate the efficacy of SkipNode and its superiority over state-of-the-art baselines.
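The following is a minimal PyTorch sketch of the layer-wise node-skipping idea described in the abstract. It assumes uniform sampling of skipped nodes with rate p, a dense symmetrically normalized adjacency matrix, and dropout-style train/eval behavior; the class name SkipNodeGCNLayer and these choices are illustrative assumptions, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class SkipNodeGCNLayer(nn.Module):
    """One GCN layer with layer-wise node skipping (a sketch of the
    SkipNode idea). At each forward pass, a random subset of nodes
    bypasses the convolution and keeps its input features, so
    (1) those features avoid further smoothing at this layer and
    (2) gradients flow straight back through the identity path.
    The uniform sampling rate `p` is an assumed strategy."""

    def __init__(self, in_dim: int, out_dim: int, p: float = 0.5):
        super().__init__()
        # The identity path requires matching input/output dimensions.
        assert in_dim == out_dim, "skip path needs in_dim == out_dim"
        self.linear = nn.Linear(in_dim, out_dim)
        self.p = p  # fraction of nodes that skip the convolution

    def forward(self, x: torch.Tensor, adj_norm: torch.Tensor) -> torch.Tensor:
        # Standard GCN propagation: H' = ReLU(A_hat @ H @ W), where
        # A_hat is the normalized adjacency with self-loops (dense here).
        conv = torch.relu(adj_norm @ self.linear(x))
        if not self.training:
            # Assumption: use the full convolution at inference,
            # analogous to how dropout behaves in eval mode.
            return conv
        # Sample nodes to skip; they keep their input features unchanged.
        skip_mask = torch.rand(x.size(0), device=x.device) < self.p
        return torch.where(skip_mask.unsqueeze(-1), x, conv)
```

Stacking such layers keeps an unobstructed identity path for the sampled nodes at every depth, which is what lets gradients bypass the repeated convolutions and counteracts gradient vanishing.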