A Poincaré Inequality and Consistency Results for Signal Sampling on Large Graphs (2311.10610v3)
Abstract: Large-scale graph machine learning is challenging as the complexity of learning models scales with the graph size. Subsampling the graph is a viable alternative, but sampling on graphs is nontrivial as graphs are non-Euclidean. Existing graph sampling techniques require not only computing the spectra of large matrices but also repeating these computations when the graph changes, e.g., grows. In this paper, we introduce a signal sampling theory for a type of graph limit, the graphon. We prove a Poincaré inequality for graphon signals and show that complements of node subsets satisfying this inequality are unique sampling sets for Paley-Wiener spaces of graphon signals. Exploiting connections with spectral clustering and Gaussian elimination, we prove that such sampling sets are consistent in the sense that unique sampling sets on a convergent graph sequence converge to unique sampling sets on the graphon. We then propose a related graphon signal sampling algorithm for large graphs, and demonstrate its good empirical performance on graph machine learning tasks.