DropMessage: Unifying Random Dropping for Graph Neural Networks (2204.10037v3)
Abstract: Graph Neural Networks (GNNs) are powerful tools for graph representation learning. Despite their rapid development, GNNs still face several challenges, such as over-fitting, over-smoothing, and non-robustness. Previous works indicate that these problems can be alleviated by random dropping methods, which integrate augmented data into models by randomly masking parts of the input. However, several open problems of random dropping on GNNs remain to be solved. First, it is challenging to find a universal method that is suitable for all cases, given the divergence of different datasets and models. Second, the augmented data introduced to GNNs cause incomplete coverage of parameters and an unstable training process. Third, there is no theoretical analysis of the effectiveness of random dropping methods on GNNs. In this paper, we propose a novel random dropping method called DropMessage, which performs the dropping operation directly on the propagated messages during the message-passing process. More importantly, we find that DropMessage provides a unified framework for most existing random dropping methods, based on which we give a theoretical analysis of their effectiveness. Furthermore, we elaborate on the superiority of DropMessage: it stabilizes the training process by reducing sample variance, and it preserves information diversity from the perspective of information theory, making it a theoretical upper bound of the other methods. To evaluate the proposed method, we conduct experiments on multiple tasks over five public datasets and two industrial datasets with various backbone models. The experimental results show that DropMessage is both effective and general, and that it can significantly alleviate the problems mentioned above.
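The core mechanism described in the abstract, dropping individual entries of the per-edge message matrix during message passing rather than dropping nodes, edges, or feature columns, can be sketched as follows. This is a minimal illustration in plain PyTorch, not the authors' reference implementation; the class name `DropMessageLayer`, the `drop_rate` parameter, and the GCN-style sum aggregation are assumptions made for the example.

```python
# Minimal sketch of the DropMessage idea: element-wise random dropping
# applied to the per-edge message matrix during message passing.
import torch
import torch.nn as nn


class DropMessageLayer(nn.Module):
    def __init__(self, in_dim, out_dim, drop_rate=0.5):
        super().__init__()
        self.lin = nn.Linear(in_dim, out_dim)
        self.drop_rate = drop_rate  # probability of dropping each message entry

    def forward(self, x, edge_index):
        # x: [num_nodes, in_dim]; edge_index: [2, num_edges] with (src, dst) rows
        src, dst = edge_index
        h = self.lin(x)

        # Build the message matrix: one row per directed edge.
        messages = h[src]                      # [num_edges, out_dim]

        # DropMessage: independently zero out individual entries of the
        # message matrix during training, then rescale the survivors.
        if self.training and self.drop_rate > 0:
            keep = 1.0 - self.drop_rate
            mask = torch.bernoulli(torch.full_like(messages, keep))
            messages = messages * mask / keep

        # Sum-aggregate the (possibly dropped) messages at destination nodes.
        out = torch.zeros_like(h)
        out.index_add_(0, dst, messages)
        return out
```

A hypothetical usage would be `layer = DropMessageLayer(16, 32); layer.train(); out = layer(x, edge_index)` with `x` of shape `[num_nodes, 16]` and `edge_index` of shape `[2, num_edges]`. Because the mask is drawn per edge and per feature dimension, two edges sharing the same source node drop different entries, which is what distinguishes message-level dropping from feature-level Dropout applied before propagation.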