Two Heads Are Better Than One: Boosting Graph Sparse Training via Semantic and Topological Awareness (2402.01242v1)

Published 2 Feb 2024 in cs.LG

Abstract: Graph Neural Networks (GNNs) excel in various graph learning tasks but face computational challenges when applied to large-scale graphs. A promising solution is to remove non-essential edges to reduce the computational overhead of GNNs. Previous literature generally falls into two categories: topology-guided and semantic-guided. The former maintains certain graph topological properties yet often underperforms on GNNs due to low integration with neural network training. The latter performs well at lower sparsity on GNNs but faces performance collapse at higher sparsity levels. With this in mind, we take the first step to propose a new research line and concept termed Graph Sparse Training (GST), which dynamically manipulates sparsity at the data level. Specifically, GST initially constructs a topology & semantic anchor at a low training cost, followed by performing dynamic sparse training to align the sparse graph with the anchor. We introduce the Equilibria Sparsification Principle to guide this process, effectively balancing the preservation of both topological and semantic information. Ultimately, GST produces a sparse graph with maximum topological integrity and no performance degradation. Extensive experiments on 6 datasets and 5 backbones showcase that GST (I) identifies subgraphs at higher graph sparsity levels (1.67%~15.85% $\uparrow$) than state-of-the-art sparsification methods, (II) preserves more key spectral properties, (III) achieves 1.27-3.42$\times$ speedup in GNN inference and (IV) successfully helps graph adversarial defense and graph lottery tickets.
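
To make the pipeline described in the abstract more concrete, the sketch below shows one plausible prune-and-regrow update over an edge mask, with edge importance taken as a blend of a semantic term and a topological term ("two heads"). This is a minimal illustration under assumed scoring functions and hyperparameters (`blend_scores`, `alpha`, `swap_frac` are hypothetical), not the authors' implementation of GST or of the Equilibria Sparsification Principle.

```python
import torch

def blend_scores(sem_score, topo_score, alpha=0.5):
    # Illustrative "two heads" importance: a convex blend of a semantic term
    # (e.g. gradient magnitude of the loss w.r.t. each edge weight) and a
    # topological term (e.g. effective resistance). Both terms and the
    # weighting are assumptions for this sketch.
    return alpha * sem_score + (1.0 - alpha) * topo_score

def prune_and_regrow(edge_mask, edge_score, swap_frac=0.1):
    # One dynamic-sparse-training update: drop the weakest active edges and
    # reactivate an equal number of the strongest inactive ones, so the
    # number of kept edges (the sparsity budget) stays fixed.
    active = edge_mask.nonzero(as_tuple=True)[0]
    inactive = (~edge_mask).nonzero(as_tuple=True)[0]
    num_swap = min(int(swap_frac * active.numel()), inactive.numel())
    if num_swap == 0:
        return edge_mask
    drop = active[torch.topk(edge_score[active], num_swap, largest=False).indices]
    grow = inactive[torch.topk(edge_score[inactive], num_swap).indices]
    edge_mask[drop] = False
    edge_mask[grow] = True
    return edge_mask

# Toy usage: 1,000 candidate edges, keep 30% of them, refresh the mask once.
num_edges = 1000
edge_mask = torch.zeros(num_edges, dtype=torch.bool)
edge_mask[torch.randperm(num_edges)[:300]] = True
score = blend_scores(torch.rand(num_edges), torch.rand(num_edges), alpha=0.5)
edge_mask = prune_and_regrow(edge_mask, score, swap_frac=0.1)
```

In the paper's framing, the semantic and topological terms would be kept aligned with the low-cost anchor built early in training; random scores stand in for both here purely to keep the example self-contained.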

Authors (10)
  1. Guibin Zhang (29 papers)
  2. Yanwei Yue (7 papers)
  3. Kun Wang (355 papers)
  4. Junfeng Fang (45 papers)
  5. Yongduo Sui (14 papers)
  6. Kai Wang (624 papers)
  7. Yuxuan Liang (126 papers)
  8. Dawei Cheng (38 papers)
  9. Shirui Pan (198 papers)
  10. Tianlong Chen (202 papers)
Citations (8)
