Stability and Generalization of $\ell_p$-Regularized Stochastic Learning for GCN (arXiv:2305.12085v3)
Abstract: Graph convolutional networks (GCNs) are among the most popular variants of graph neural networks for learning on graph data and have shown strong empirical performance. $\ell_2$-based graph smoothing enforces global smoothness in GCNs, while (soft) $\ell_1$-based sparse graph learning promotes signal sparsity at the price of discontinuity. This paper quantifies the trade-off between smoothness and sparsity in GCNs via a general $\ell_p$-regularized $(1<p\leq 2)$ stochastic learning scheme proposed herein. Prior work gives stability-based generalization analyses under a twice-differentiable objective function, a smoothness condition our $\ell_p$-regularized learning scheme does not satisfy. To tackle this issue, we propose a novel proximal SGD algorithm for GCNs with an inexact proximal operator. For a single-layer GCN, we establish an explicit theoretical understanding of $\ell_p$-regularized stochastic learning by analyzing the stability of our proximal SGD algorithm. We conduct multiple empirical experiments to validate our theoretical findings.
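To make the proposed scheme concrete, below is a minimal sketch of one $\ell_p$-regularized proximal SGD step for a single-layer GCN. This is an illustrative assumption, not the paper's exact algorithm: the names `inexact_prox_lp` and `prox_sgd_step` are hypothetical, and the Newton-based prox is one plausible way to realize an "inexact operator". For $1<p\leq 2$ the proximal operator of $\lambda\|w\|_p^p$ has no general closed form, so a small, fixed budget of Newton steps on the coordinate-wise optimality condition yields an inexact prox in the spirit the abstract describes.

```python
import numpy as np

def inexact_prox_lp(v, lam, p, n_newton=10, eps=1e-12):
    """Inexact prox of w -> lam * sum_i |w_i|^p, applied coordinate-wise.

    For 1 < p <= 2 there is no general closed form, so we approximate the
    root of the optimality condition x + lam*p*x**(p-1) = |v| (with x >= 0)
    by a fixed, small number of Newton steps; the fixed budget is exactly
    what makes the operator inexact.
    """
    sign, a = np.sign(v), np.abs(v)
    x = a.copy()                        # warm start at the unpenalized point
    for _ in range(n_newton):
        xs = np.maximum(x, eps)         # guard x**(p-2) near zero when p < 2
        grad = x + lam * p * xs ** (p - 1) - a
        hess = 1.0 + lam * p * (p - 1) * xs ** (p - 2)
        x = np.clip(x - grad / hess, 0.0, None)
    return sign * x

def prox_sgd_step(W, batch_grad, lam, p, lr):
    """One proximal SGD step: a gradient step on the data-fitting loss of a
    single-layer GCN (e.g. f(X) = A_hat @ X @ W, with batch_grad the
    stochastic gradient at W), followed by the inexact lp prox."""
    return inexact_prox_lp(W - lr * batch_grad, lr * lam, p)
```

Two sanity checks on this sketch: for $p=2$ the prox has the closed form $W/(1+2\,\mathrm{lr}\,\lambda)$, which the Newton iteration recovers exactly in one step since the subproblem is quadratic; as $p\to 1$ the operator approaches soft-thresholding, mirroring the smoothness-sparsity trade-off the paper studies.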