- The paper introduces FastGCN, which reframes graph convolutions as integral transforms to enable efficient batched training through Monte Carlo importance sampling.
- It employs an importance sampling strategy whose distribution is proportional to the squared column norms of the normalized adjacency matrix, reducing the variance of the layer estimates and cutting computational overhead.
- Experimental results on datasets like Reddit and Pubmed show that FastGCN maintains competitive accuracy while drastically reducing training time for large-scale graphs.
FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling
Graph Convolutional Networks (GCNs), as proposed by Kipf and Welling, have shown substantial promise for various graph-related learning tasks, primarily in semi-supervised settings. However, traditional GCNs face two significant practical constraints: they require access to both training and test data during the learning phase, and they incur the computational burden of recursive neighborhood expansion across graph layers. This paper, "FastGCN: Fast Learning with Graph Convolutional Networks via Importance Sampling" by Jie Chen, Tengfei Ma, and Cao Xiao, introduces a novel approach to address these limitations.
Methodological Contributions
The core innovation presented in this paper is the FastGCN methodology, which reframes graph convolutions as integral transforms of embedding functions under probability measures. This perspective permits the use of Monte Carlo sampling to approximate these integrals, enabling a batched training process. Combined with importance sampling, FastGCN substantially improves training efficiency while preserving generalization at inference time.
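In this view, each layer's activation is written as an integral over a vertex distribution $P$ and then estimated by a sample mean. Loosely following the paper's notation, with $\hat{A}$ the normalized adjacency, $h^{(l)}$ the layer-$l$ embedding function, and $W^{(l)}$ the layer weights:

$$ h^{(l+1)}(v) \;=\; \sigma\!\Big( \int \hat{A}(v,u)\, h^{(l)}(u)\, dP(u)\; W^{(l)} \Big) \;\approx\; \sigma\!\Big( \tfrac{1}{t_l} \sum_{j=1}^{t_l} \hat{A}(v, u_j)\, h^{(l)}(u_j)\, W^{(l)} \Big), \qquad u_1,\dots,u_{t_l} \overset{\text{iid}}{\sim} P. $$

Because each layer draws its own fixed-size sample, the number of vertices touched per batch is bounded by the sample sizes rather than by the recursively expanded neighborhood.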
Key methodological advancements include:
- Integral Transform Perspective:
- The authors reinterpret graph convolutions as integral transforms, allowing the integrals to be estimated consistently via Monte Carlo sampling. This approach supports inductive learning and cleanly separates training from test data, a critical requirement for dynamically growing graphs.
- Importance Sampling for Variance Reduction:
- An improved sampling scheme reduces the variance of the Monte Carlo estimates through importance sampling. By choosing a sampling distribution proportional to the squared column norms of the normalized adjacency matrix, the authors avoid the computational inefficiencies traditionally associated with recursive neighborhood expansion in GCNs.
- Batched Training Algorithm:
- FastGCN introduces a batched training algorithm in which the computational cost per batch is bounded by the per-layer sample sizes. The authors back this with convergence proofs, showing that gradient-based optimization remains consistent despite the inherent sampling noise; a minimal code sketch of the sampled layer computation follows this list.
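Below is a minimal NumPy sketch of these two ingredients: the importance distribution proportional to the squared column norms of the normalized adjacency matrix, and a single sampled layer that estimates the product of the normalized adjacency with the current embeddings. Names such as `importance_distribution`, `fastgcn_layer`, `adj_norm`, and `n_samples` are illustrative and not taken from the authors' implementation.

```python
import numpy as np

def importance_distribution(adj_norm):
    """Sampling probabilities q(u) proportional to the squared norm of
    column u of the (dense) normalized adjacency matrix -- the paper's
    variance-reducing choice."""
    col_sq_norms = (adj_norm ** 2).sum(axis=0)
    return col_sq_norms / col_sq_norms.sum()

def fastgcn_layer(adj_norm, h, weight, q, n_samples, rng=None):
    """One sampled graph-convolution layer: draw n_samples vertices from q,
    rescale the selected columns by 1 / (n_samples * q) so the sample mean
    is an unbiased estimate of adj_norm @ h, then apply the layer weight
    and a ReLU nonlinearity."""
    rng = rng or np.random.default_rng()
    n = adj_norm.shape[1]
    idx = rng.choice(n, size=n_samples, replace=True, p=q)
    scale = 1.0 / (n_samples * q[idx])               # importance weights
    support = (adj_norm[:, idx] * scale) @ h[idx]    # Monte Carlo estimate of adj_norm @ h
    return np.maximum(support @ weight, 0.0)         # ReLU activation
```

In a full model, q would be computed once from the normalized adjacency matrix and reused for every layer and batch; fixing the per-layer sample sizes is what keeps the per-batch cost independent of the recursively expanded neighborhood.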
Experimental Results
The proposed FastGCN approach was evaluated against traditional GCNs and GraphSAGE on benchmark datasets: Cora, Pubmed, and Reddit. The empirical analysis highlights:
- Efficiency:
- FastGCN demonstrates a significant reduction in training time, often outperforming GraphSAGE by orders of magnitude. For instance, on Reddit, FastGCN's per-batch training time is notably lower than that of both GraphSAGE and standard GCNs.
- Accuracy:
- Despite the aggressive reduction in computation, FastGCN maintains competitive classification performance. On Pubmed, FastGCN achieves a micro-F1 score of 0.880, compared with 0.849 for GraphSAGE-GCN and 0.867 for batched GCN.
- Scalability:
- FastGCN proves particularly advantageous for large, dense graphs like Reddit, where traditional approaches either fail due to memory constraints or suffer from prohibitive computational overhead.
Implications and Future Directions
The implications of FastGCN are multifaceted, impacting both theoretical and practical domains in graph-based learning.
Theoretical Implications:
- The integral transform perspective combined with Monte Carlo sampling presents a promising framework for extending GCN architectures. This approach can potentially be generalized to other graph models that rely on neighborhood aggregation, paving the way for future research in efficient graph learning methodologies.
Practical Implications:
- The ability to efficiently train GCNs without requiring simultaneous access to test data is crucial for applications in dynamically evolving systems such as social networks or recommendation systems. FastGCN can thus facilitate real-time and scalable graph learning in such scenarios.
Speculative Future Directions:
- Future research could explore optimizing the importance sampling further, possibly integrating adaptive sampling methods or advanced variance reduction techniques.
- Extending this framework to handle heterogeneous or multi-modal graphs would be a worthwhile pursuit to address more complex graph learning tasks.
- Investigating the integration of FastGCN with other types of neural network architectures, particularly for tasks beyond node classification, such as link prediction or graph generation, could yield substantial advancements in the graph neural network domain.
In conclusion, FastGCN marks a significant step towards more efficient and scalable graph learning methods, addressing the critical limitations of conventional GCNs through a principled and theoretically sound approach. The empirical results reaffirm its practical utility and open avenues for extensive future research in graph-based learning.