An Efficient Algorithm for Training Large-Scale Graph Convolutional Networks
"Cluster-GCN: An Efficient Algorithm for Training Deep and Large Graph Convolutional Networks," authored by Wei-Lin Chiang et al., presents a significant advancement in the domain of Graph Convolutional Networks (GCNs). The authors acknowledge the computationally intensive nature of traditional GCN training methods and propose an optimized algorithm, Cluster-GCN, that significantly reduces both memory usage and training time.
Summary of Key Contributions
The paper identifies and addresses two prominent challenges in training large-scale GCNs:
- Computational Cost: Full-batch gradient descent must store the intermediate embeddings of every node at every layer, so memory grows with both the graph size and the number of GCN layers (see the sketch after this list).
- Scalability: Existing SGD-based mini-batch algorithms avoid the full-batch memory cost but suffer from the neighborhood expansion problem: the set of nodes needed to compute a single output embedding grows exponentially with the number of layers.
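For intuition, the following is a minimal sketch, not the paper's code, of a standard full-batch GCN forward pass; it makes explicit that every layer materializes an embedding matrix for all N nodes, which is the memory bottleneck described above.

```python
# Minimal full-batch GCN forward sketch: each layer holds an N x F activation
# matrix for the WHOLE graph, so memory scales with nodes and layers.
import numpy as np
import scipy.sparse as sp

def normalize_adj(A):
    """Symmetrically normalize A + I, as in the standard GCN propagation rule."""
    A_hat = A + sp.eye(A.shape[0])
    d = np.asarray(A_hat.sum(axis=1)).ravel()
    D_inv_sqrt = sp.diags(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def full_batch_forward(A, X, weights):
    """One forward pass over ALL N nodes; every layer stores an N x F_l matrix."""
    A_norm = normalize_adj(A)
    H = X
    for W in weights:                      # L layers
        H = np.maximum(A_norm @ H @ W, 0)  # ReLU(A_hat H W); kept short by using ReLU everywhere
    return H

# Toy usage: 1,000 nodes, a random sparse graph, two layers.
N, F = 1000, 16
A = sp.random(N, N, density=0.01, format="csr")
A = A + A.T                                # make the toy graph symmetric
X = np.random.randn(N, F)
weights = [np.random.randn(F, 32), np.random.randn(32, 8)]
out = full_batch_forward(A, X, weights)    # shape (N, 8)
```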
To overcome these challenges, Cluster-GCN uses graph clustering to construct its mini-batches. The graph is partitioned into dense subgraphs, and each training step operates only on the nodes and edges of one such subgraph, which significantly improves training efficiency.
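As an illustration of the partitioning step, here is a small sketch that assumes the pymetis binding to METIS is available (any graph-clustering library could be substituted); the resulting cluster memberships define the mini-batches.

```python
# Sketch of the partitioning step, assuming the `pymetis` binding to METIS.
import numpy as np
import pymetis  # one possible METIS binding; the paper uses METIS itself

def partition_graph(adjacency_list, num_clusters):
    """Partition nodes into `num_clusters` dense clusters with METIS.

    adjacency_list: entry i is a list/array of node i's neighbor indices.
    Returns a list of node-index arrays, one per cluster.
    """
    _, membership = pymetis.part_graph(num_clusters, adjacency=adjacency_list)
    membership = np.asarray(membership)
    return [np.where(membership == c)[0] for c in range(num_clusters)]

# Each training step then draws one cluster and works only on its induced subgraph.
```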
Algorithmic Innovations
Cluster-GCN introduces a novel strategy in which the underlying graph is partitioned using an efficient clustering algorithm such as METIS. In each training iteration, a dense subgraph (cluster) is sampled, and neighborhood aggregation is restricted to nodes within that cluster. This method, sketched in code after the list below, yields several benefits:
- Memory Efficiency: By focusing on smaller subgraphs, Cluster-GCN reduces the need to store embeddings for the entire graph, drastically minimizing memory usage.
- Improved Training Time: The localized nature of the clusters keeps the time complexity per epoch linear in the graph size, whereas existing sampling-based methods incur a cost that grows exponentially with the number of layers.
- Enhanced Scalability: The clustering approach ensures the model can handle significantly larger graphs, which would traditionally be infeasible with previous methods.
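To make the training loop concrete, here is a hedged sketch, not the authors' implementation, of cluster-wise training in PyTorch: each step normalizes the adjacency block of a single cluster and runs the forward and backward passes only over that cluster's nodes.

```python
# Hedged sketch of cluster-wise training: each step uses one cluster's induced
# subgraph, so activations are materialized for that cluster alone.
import numpy as np
import scipy.sparse as sp
import torch
import torch.nn as nn

class GCNLayer(nn.Module):
    """One propagation step: A_hat @ (H W)."""
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.linear = nn.Linear(in_dim, out_dim)

    def forward(self, a_norm, h):
        return torch.sparse.mm(a_norm, self.linear(h))

def normalized_sparse_tensor(A_sub):
    """Symmetrically normalize A + I for one cluster; return a torch sparse tensor."""
    A_hat = (A_sub + sp.eye(A_sub.shape[0])).tocoo()
    d = np.asarray(A_hat.sum(axis=1)).ravel()
    A_hat = (sp.diags(1.0 / np.sqrt(d)) @ A_hat @ sp.diags(1.0 / np.sqrt(d))).tocoo()
    idx = torch.tensor(np.vstack([A_hat.row, A_hat.col]), dtype=torch.long)
    vals = torch.tensor(A_hat.data, dtype=torch.float32)
    return torch.sparse_coo_tensor(idx, vals, A_hat.shape).coalesce()

def train_epoch(A, X, y, clusters, layers, optimizer, loss_fn):
    """One epoch: visit clusters in random order, training on each induced subgraph."""
    for c in np.random.permutation(len(clusters)):
        nodes = clusters[c]
        a_norm = normalized_sparse_tensor(A[nodes][:, nodes])  # within-cluster edges only
        h = torch.tensor(X[nodes], dtype=torch.float32)
        for layer in layers[:-1]:
            h = torch.relu(layer(a_norm, h))   # only O(|cluster|) embeddings per layer
        logits = layers[-1](a_norm, h)
        loss = loss_fn(logits, torch.tensor(y[nodes], dtype=torch.long))
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

Because every tensor in the loop is indexed by a single cluster, peak memory is governed by the largest cluster rather than by the full graph, which is the source of the savings listed above.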
Experimental Validation
The paper's authors validate Cluster-GCN through an extensive set of experiments on various benchmark datasets, including PPI, Reddit, and a newly created Amazon2M dataset. The results demonstrate:
- Memory Usage: Cluster-GCN uses up to 5 times less memory compared to VR-GCN when training a 3-layer GCN on the Amazon2M dataset.
- Training Speed: For deeper networks, Cluster-GCN trains faster; for instance, a 3-layer GCN on Amazon2M trains in 1,523 seconds, compared to 1,961 seconds for VR-GCN.
- Model Accuracy: The paper reports state-of-the-art test F1 scores on datasets such as PPI (99.36) and Reddit (96.60), facilitated by deeper GCN training capabilities unlocked by Cluster-GCN.
Implications and Future Directions
The practical implications of Cluster-GCN are significant, particularly in applications where large-scale graphs are prevalent, such as social network analysis, protein-protein interaction networks, and recommendation systems. Its ability to scale efficiently enables the training of deeper, more expressive models, potentially leading to more accurate predictions.
Theoretically, the paper opens avenues for further research in optimizing GCN training algorithms. Potential future work could explore:
- Adaptive Clustering Techniques: Refining clustering methods to dynamically adjust to the graph's changing structure during training.
- Integration with Other GCN Variants: Adapting Cluster-GCN to work with advanced GCN architectures and potentially improving their efficiency.
In conclusion, Cluster-GCN represents a significant step forward in the efficient training of GCNs on large-scale datasets. The innovative use of graph clustering to optimize both memory and computational requirements paves the way for further advancements in the field of graph-based deep learning.