- The paper presents DropEdge, a technique that randomly drops graph edges during training to mitigate over-fitting and over-smoothing in deep GCNs.
- It also provides theory showing that DropEdge either slows the convergence toward over-smoothing or reduces the resulting information loss, and introduces a layer-wise variant that drops edges independently at each layer.
- Empirical results show up to a 13.5% accuracy boost on Citeseer, highlighting its effectiveness in training deeper GCN architectures.
DropEdge: Towards Deep Graph Convolutional Networks on Node Classification
The paper "DropEdge: Towards Deep Graph Convolutional Networks on Node Classification" introduces a targeted approach to improve the performance of deep Graph Convolutional Networks (GCNs). Recognizing over-fitting and over-smoothing as primary barriers to effective deep GCN training, the authors propose DropEdge, a technique that randomly drops edges from the graph during training.
Methodological Innovations
DropEdge Methodology: DropEdge operates by randomly removing a fraction of edges from the graph at each training epoch. This random edge removal serves a dual purpose. First, it acts as a data augmenter, increasing the diversity of the input by producing many deformed copies of the graph. Second, it acts as a message-passing reducer, sparsifying node connections and thereby alleviating the over-smoothing problem inherent in deeper GCNs.
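Below is a minimal PyTorch-style sketch of this procedure, assuming the graph is stored as a `[2, E]` COO edge list; the helper names `drop_edge` and `normalized_adj` are illustrative and not taken from the paper's released code.

```python
import torch

def drop_edge(edge_index: torch.Tensor, drop_rate: float) -> torch.Tensor:
    """Keep each edge independently with probability 1 - drop_rate.

    Note: for an undirected graph the paper samples edges, so both directions
    of an edge would be kept or dropped together; that detail is omitted here.
    """
    keep_mask = torch.rand(edge_index.size(1)) >= drop_rate
    return edge_index[:, keep_mask]

def normalized_adj(edge_index: torch.Tensor, num_nodes: int) -> torch.Tensor:
    """Dense symmetric renormalization D^{-1/2} (A + I) D^{-1/2} (small graphs only)."""
    adj = torch.zeros(num_nodes, num_nodes)
    adj[edge_index[0], edge_index[1]] = 1.0
    adj = adj + torch.eye(num_nodes)          # add self-loops
    deg_inv_sqrt = adj.sum(dim=1).pow(-0.5)
    return deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)

# Typical use: resample once per epoch during training; evaluate on the full graph.
# adj_train = normalized_adj(drop_edge(edge_index, drop_rate=0.5), num_nodes)
```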
Layer-Wise DropEdge: The paper also presents a variant named Layer-Wise DropEdge, in which the edge-dropping procedure is applied independently at each layer. This injects further randomness and stronger regularization into training, at a somewhat higher computational cost.
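A sketch of the layer-wise variant under the same assumptions, reusing the helpers above: each layer draws its own random subgraph during training rather than sharing a single sampled graph per epoch. The module below is a simplified stand-in for the backbones used in the paper, not their implementation.

```python
import torch
import torch.nn as nn

class LayerWiseDropEdgeGCN(nn.Module):
    """Plain GCN stack where every layer re-samples edges independently."""
    def __init__(self, dims, drop_rate=0.5):
        super().__init__()
        self.layers = nn.ModuleList(
            [nn.Linear(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])]
        )
        self.drop_rate = drop_rate

    def forward(self, x, edge_index, num_nodes):
        for i, layer in enumerate(self.layers):
            # Independent edge sampling (and re-normalization) per layer, training only.
            ei = drop_edge(edge_index, self.drop_rate) if self.training else edge_index
            adj = normalized_adj(ei, num_nodes)
            x = adj @ layer(x)
            if i < len(self.layers) - 1:
                x = torch.relu(x)
        return x
```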
Empirical and Theoretical Contributions
Theoretical Insights: The authors provide rigorous theoretical backing for DropEdge's effectiveness. They demonstrate that DropEdge either slows the convergence of over-smoothing, raising the relaxed smoothing layer (the depth at which over-smoothing sets in), or lessens the information loss by enlarging the dimension of the subspace to which representations converge. These results hinge on the eigenspectrum properties of the graph's adjacency matrix.
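One simplified, linearized way to state the intuition (a reconstruction in common GCN notation, not the paper's exact theorem): let $\hat{\mathbf{A}}$ be the renormalized adjacency matrix, $\mathcal{M}$ the subspace spanned by its dominant eigenvectors (one per connected component in the linear case), and $d_{\mathcal{M}}(\cdot)$ the distance of the node features to that subspace. Repeated propagation then contracts the features toward $\mathcal{M}$:

$$ d_{\mathcal{M}}\!\left(\hat{\mathbf{A}}^{\,l}\mathbf{X}\right) \;\le\; \lambda^{\,l}\, d_{\mathcal{M}}(\mathbf{X}), $$

where $\lambda < 1$ is the largest eigenvalue magnitude outside $\mathcal{M}$. Heuristically, removing edges tends to push $\lambda$ toward 1, so more layers are needed before the features fall within a tolerance $\varepsilon$ of $\mathcal{M}$, and it can split the graph into more components, enlarging $\dim(\mathcal{M})$ and hence the information retained at convergence.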
Empirical Evaluation: Extensive experiments on benchmarks such as Cora, Citeseer, Pubmed, and Reddit showcase that DropEdge generally improves performance across multiple GCN architectures (e.g., GCN, ResGCN, JKNet, IncepGCN, and GraphSAGE). Notably:
- DropEdge substantially reduces over-fitting, as evidenced by markedly lower validation loss for deep graph convolutional models.
- The technique enables deep GCNs to combat over-smoothing, thereby preserving the meaningful representation of nodes in deeper networks.
- The augmentation and message-passing reduction introduced by DropEdge yield an overall performance gain, with several backbone-plus-DropEdge combinations reaching the best accuracies reported in the paper.
Numerical Results
Performance Gains: The paper reports that on Citeseer, DropEdge achieves a 13.5% absolute accuracy improvement for 64-layer models, highlighting its effectiveness in the deep regime. Similar trends hold across the other datasets and depths, showing consistent improvements for both shallow and deep GCNs.
Ablation Studies: Comparative studies reveal the synergistic effect of combining DropEdge with Dropout on node features, which further improves performance by curbing both over-fitting and over-smoothing. Additionally, the layer-wise DropEdge variant achieves a lower training loss, though it improves validation performance only marginally.
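As a concrete illustration of that combination, here is a hedged sketch of a single training step that applies feature-level Dropout alongside DropEdge, reusing `drop_edge` and `normalized_adj` from the earlier snippet. The model is assumed to take `(features, normalized adjacency)`, and the 0.5 rates are illustrative rather than the paper's tuned hyper-parameters.

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, x, edge_index, labels, train_mask,
               num_nodes, edge_drop=0.5, feat_drop=0.5):
    model.train()
    optimizer.zero_grad()
    adj = normalized_adj(drop_edge(edge_index, edge_drop), num_nodes)  # DropEdge on the graph
    h = F.dropout(x, p=feat_drop, training=True)                       # Dropout on node features
    out = model(h, adj)
    loss = F.cross_entropy(out[train_mask], labels[train_mask])
    loss.backward()
    optimizer.step()
    return loss.item()
```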
Implications and Future Directions
Practical Implications: This paper demonstrates how DropEdge can transform GCN training, enabling deeper, more expressive models without succumbing to the usual pitfalls of over-fitting and over-smoothing. Its applicability across diverse GCN architectures broadens its utility and suggests straightforward integration into future GCN-based systems.
Theoretical Implications: The theoretical groundwork established for DropEdge could catalyze further research into graph sparsification methods tailored to specific GNN tasks. The concepts and proofs concerning the layer at which over-smoothing sets in could form a basis for new regularization techniques in deep learning on graphs.
Future Research: Building on this, future work could explore:
- Adaptive DropEdge mechanisms that dynamically adjust edge-dropping rates during training.
- Extensions of DropEdge to GNN models beyond GCNs, such as attention-based GNNs and spectral GNNs.
- Real-world application assessments of DropEdge in large-scale graph-structured data domains like social networks and biological networks.
Conclusion
DropEdge offers a simple yet principled approach to enhancing deep GCN training, addressing fundamental obstacles through a theoretically grounded and empirically validated framework. The detailed analyses and extensive experiments reinforce its utility for advancing GCN research and applications.