E2GNN: Efficient Graph Neural Network Ensembles for Semi-Supervised Classification (2405.03401v1)
Abstract: This work studies ensemble learning for graph neural networks (GNNs) under the popular semi-supervised setting. Ensemble learning has shown superiority in improving the accuracy and robustness of traditional machine learning by combining the outputs of multiple weak learners. However, adopting a similar idea to integrate different GNN models is challenging for two reasons. First, GNNs are notorious for their poor inference efficiency, so naively assembling multiple GNN models would degrade it further. Second, when GNN models are trained with few labeled nodes, their performance is limited. In this case, a vanilla ensemble approach such as majority voting may be sub-optimal, since most base GNN models may make wrong predictions. To this end, we propose E2GNN, an efficient ensemble learner that assembles multiple GNNs in a learnable way by leveraging both labeled and unlabeled nodes. Specifically, we first pre-train different GNN models on a given dataset using its labeled nodes. Next, instead of directly combining their outputs for label inference, we train a simple multi-layer perceptron (MLP) to mimic their predictions on both labeled and unlabeled nodes. The unified MLP model is then deployed to infer labels for unlabeled or new nodes. Since different GNN models may predict unlabeled nodes incorrectly, we develop a reinforced discriminator that filters out wrongly predicted nodes to boost the MLP's performance. In this way, we obtain a principled approach that tackles the inference inefficiency of GNN ensembles while retaining the key merit of ensemble learning: improved performance. Comprehensive experiments in both transductive and inductive settings, across different GNN backbones and 8 benchmark datasets, demonstrate the superiority of E2GNN.
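To make the described pipeline concrete, here is a minimal PyTorch sketch of the distillation step, assuming the teacher GNNs have already been pre-trained and their per-node logits cached. The confidence-threshold filter is an illustrative stand-in for the paper's reinforced discriminator, and all function and variable names are our own assumptions rather than the authors' code.

```python
# Minimal sketch: compress an ensemble of pre-trained GNN teachers into a
# single MLP student that consumes only node features (no graph structure).
import torch
import torch.nn.functional as F

def distill_step(mlp, teacher_logits, x, labeled_idx, y_labeled, optimizer,
                 conf_threshold=0.9):
    """One training step of the MLP student.

    mlp            -- student network mapping node features to class logits
    teacher_logits -- list of [num_nodes, num_classes] tensors, one per
                      frozen, pre-trained GNN teacher
    x              -- [num_nodes, num_features] node feature matrix
    labeled_idx    -- indices of labeled nodes; y_labeled holds their labels
    """
    mlp.train()
    logits = mlp(x)  # graph-free forward pass: cheap at inference time

    # Supervised loss on the few labeled nodes.
    sup_loss = F.cross_entropy(logits[labeled_idx], y_labeled)

    # Average the teachers' softmax outputs into per-node pseudo-labels.
    with torch.no_grad():
        probs = torch.stack([t.softmax(dim=-1) for t in teacher_logits]).mean(0)
        conf, pseudo = probs.max(dim=-1)

    # Filter unlabeled nodes: keep only confident pseudo-labels. (E2GNN
    # instead learns this filter with a REINFORCE-trained discriminator.)
    mask = torch.ones(x.size(0), dtype=torch.bool)
    mask[labeled_idx] = False
    mask &= conf > conf_threshold

    distill_loss = F.cross_entropy(logits[mask], pseudo[mask]) if mask.any() else 0.0

    loss = sup_loss + distill_loss
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return float(loss)
```

At deployment only the MLP student is kept, so label inference no longer requires neighborhood aggregation and new nodes can be classified from their features alone, which is what makes the ensemble efficient in the inductive setting.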