BNS-GCN: Efficient Full-Graph Training of Graph Convolutional Networks with Partition-Parallelism and Random Boundary Node Sampling (2203.10983v2)

Published 21 Mar 2022 in cs.LG and cs.AI

Abstract: Graph Convolutional Networks (GCNs) have emerged as the state-of-the-art method for graph-based learning tasks. However, training GCNs at scale is still challenging, hindering both the exploration of more sophisticated GCN architectures and their applications to real-world large graphs. While it might be natural to consider graph partition and distributed training for tackling this challenge, previous works have only scratched the surface of this direction due to the limitations of existing designs. In this work, we first analyze why distributed GCN training is ineffective and identify the underlying cause to be the excessive number of boundary nodes of each partitioned subgraph, which easily explodes the memory and communication costs for GCN training. Furthermore, we propose a simple yet effective method dubbed BNS-GCN that adopts random Boundary-Node-Sampling to enable efficient and scalable distributed GCN training. Experiments and ablation studies consistently validate the effectiveness of BNS-GCN, e.g., boosting the throughput by up to 16.2x and reducing the memory usage by up to 58%, while maintaining full-graph accuracy. Furthermore, both theoretical and empirical analyses show that BNS-GCN enjoys better convergence than existing sampling-based methods. We believe that our BNS-GCN has opened up a new paradigm for enabling GCN training at scale. The code is available at https://github.com/RICE-EIC/BNS-GCN.

Authors (5)
  1. Cheng Wan (48 papers)
  2. Youjie Li (5 papers)
  3. Ang Li (472 papers)
  4. Nam Sung Kim (30 papers)
  5. Yingyan Lin (67 papers)
Citations (67)

Summary

Efficient Full-Graph Training of Graph Convolutional Networks Using BNS-GCN

The paper presents a method named BNS-GCN (Boundary Node Sampling for GCN), which addresses the scalability challenges in training Graph Convolutional Networks (GCNs) on large graphs using distributed systems. Graph Convolutional Networks are pivotal in processing data structured as graphs and are extensively used in tasks such as node classification, link prediction, and recommendation systems. Despite their efficacy, scaling GCNs to work with large graphs remains challenging due to significant memory and communication overheads.

BNS-GCN introduces an efficient technique that combines partition parallelism with random boundary node sampling to mitigate these overheads. Here is a structured overview of the paper's contributions and findings:

  1. Challenges Identified: The paper begins with an analysis of why distributed GCN training is inefficient, pinpointing the primary cause as the excessive number of boundary nodes in each partitioned subgraph. These nodes inflate memory usage and communication overhead, hindering training scalability.
  2. Proposed Methodology: To address these challenges, BNS-GCN randomly samples boundary nodes at each training iteration, reducing the number of nodes involved in computation and communication. The sample is re-drawn every epoch, so boundary information is still used across training while per-epoch resource usage drops sharply (a minimal sketch of this step follows the list).
  3. Empirical and Theoretical Evidence: The paper demonstrates both theoretically and empirically that BNS-GCN offers improved convergence properties over other sampling methods, like GraphSAGE and VR-GCN. Through experiments, BNS-GCN achieves up to 16.2 times higher throughput compared to state-of-the-art methods and reduces memory usage by up to 58%, all while maintaining full-graph accuracy.
  4. Performance Improvements: The efficiency of BNS-GCN scales well with both the number of partitions and the graph size, as evidenced by well-balanced memory usage across partitions. It also cuts communication time, which is typically the dominant bottleneck in distributed training settings.
  5. Applicability Across Models and Scenarios: BNS-GCN is versatile and can be plugged into existing frameworks or configured with alternative graph partitioning methods beyond METIS. It also shows robustness across different datasets and settings, including multi-machine environments, highlighting its potential for broader deployment.
  6. Future Directions: The outcome of this research paves the way for further exploration into distributed GCN training optimization, promising improved scalability and resource efficiency. Future research could extend BNS-GCN's principles to other graph learning scenarios, such as dynamic graphs or graph-based reinforcement learning settings.
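
The boundary-node mechanism described in items 1 and 2 is compact enough to sketch. The snippet below is a minimal, self-contained illustration, not the authors' implementation (which is available in the linked repository): it identifies the boundary nodes of one partition from a global partition assignment, then independently keeps each of them with a fixed probability, re-drawn every epoch. The function names, the toy graph, and the keep probability are illustrative assumptions.

```python
import numpy as np

def boundary_nodes(edges, part_id, parts):
    """Return the boundary-node set of one partition: nodes owned by other
    partitions that are adjacent to at least one node of this partition."""
    src, dst = edges                        # parallel arrays of node ids, one edge per index
    local_dst = parts[dst] == part_id       # edges whose destination is local...
    remote_src = parts[src] != part_id      # ...but whose source lives in another partition
    return np.unique(src[local_dst & remote_src])

def sample_boundary(boundary, keep_prob, rng):
    """Random boundary-node sampling: independently keep each boundary node
    with probability `keep_prob`; the sample is re-drawn every epoch."""
    mask = rng.random(boundary.shape[0]) < keep_prob
    return boundary[mask]

# Toy usage: 6 nodes split into 2 partitions, with a few cross-partition edges.
rng = np.random.default_rng(0)
parts = np.array([0, 0, 0, 1, 1, 1])                   # node id -> partition id
edges = (np.array([0, 3, 4, 5, 2]),                    # source nodes
         np.array([1, 0, 2, 2, 4]))                    # destination nodes
halo = boundary_nodes(edges, part_id=0, parts=parts)   # nodes {3, 4, 5} feed partition 0
for epoch in range(3):
    kept = sample_boundary(halo, keep_prob=0.5, rng=rng)
    # Only the features of `kept` would be fetched from remote partitions this epoch;
    # the remaining boundary nodes are excluded from aggregation until re-sampled.
    print(epoch, kept)
```

Dropping a boundary node only removes its feature from that epoch's aggregation on the local partition; a fresh sample is drawn the next epoch, so the savings in memory and communication come without permanently discarding any cross-partition edge.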

In conclusion, BNS-GCN constitutes a significant advance in the domain of scalable graph neural network training. By prioritizing the reduction of boundary node overhead, it offers a pragmatic solution to the scalability barrier faced by GCNs when applied to increasingly large and complex graph datasets.