Layer-Dependent Importance Sampling for Training Deep and Large Graph Convolutional Networks
The paper introduces LADIES, a layer-dependent importance sampling algorithm for training deep and large Graph Convolutional Networks (GCNs). GCNs have grown popular because of their strong performance on a range of graph-based tasks, but they bring computational challenges on large-scale graphs: full-batch training is expensive in both memory and time because every layer must compute representations for all nodes in the graph. This cost has motivated sampling-based techniques aimed at reducing these resource requirements.
Existing sampling techniques have their respective shortcomings. Node-wise neighbor sampling (as in GraphSAGE) suffers from exponential growth of the receptive field with depth: with a fanout of 10 and three layers, up to 10^3 = 1,000 nodes may be visited per target node. Layer-wise importance sampling (as in FastGCN) avoids this blow-up by sampling each layer independently, but without neighbor-dependent constraints the sampled layers can be poorly connected in sparse graphs, yielding sparse computation graphs and high-variance estimates. LADIES addresses both concerns with a layer-dependent sampling approach that improves computational efficiency while retaining model accuracy.
The LADIES algorithm proceeds layer by layer from the output layer downward. Given the nodes already sampled for an upper layer, it restricts attention to the union of their neighbors, computes importance probabilities for these candidates, and samples a fixed number of lower-layer nodes, forming a bipartite subgraph between the two layers. Applying this step recursively yields the complete computation graph for a minibatch. Theoretical analysis and experiments indicate that LADIES surpasses previous methods in both memory and time efficiency; moreover, its stochastic nature acts as a regularizer, allowing it to reach generalization accuracy better than that of the original full-batch GCN.
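To make the recursive construction concrete, the following is a minimal NumPy/SciPy sketch of one minibatch of layer-dependent sampling. It follows the idea described above rather than the authors' reference implementation; the function name ladies_sample, the fanouts argument, and the simplified reweighting are illustrative assumptions.

```python
import numpy as np
import scipy.sparse as sp

def ladies_sample(adj, batch_nodes, fanouts, rng=None):
    """Sketch of layer-dependent importance sampling for one minibatch.

    adj         -- normalized adjacency matrix P (scipy CSR, n x n)
    batch_nodes -- node indices of the output (top) layer
    fanouts     -- nodes to sample per layer, listed from top to bottom
    Returns, per layer, the sampled node ids and the reweighted
    bipartite adjacency block connecting them to the layer above.
    """
    rng = rng or np.random.default_rng()
    layers = []
    upper = np.asarray(batch_nodes)
    for n_l in fanouts:
        rows = adj[upper, :]  # restrict P to rows of the upper layer
        # Layer-dependent importance: p(u) proportional to the squared
        # norm of column u of the restricted matrix Q P. Nodes with no
        # edge to the upper layer get probability zero.
        probs = np.asarray(rows.multiply(rows).sum(axis=0)).ravel()
        probs /= probs.sum()
        size = min(n_l, np.count_nonzero(probs))
        cand = rng.choice(adj.shape[0], size=size, replace=False, p=probs)
        # Keep only the sampled columns and reweight them so the
        # layer-wise aggregation stays an unbiased estimate of P H.
        block = rows[:, cand].multiply(1.0 / (size * probs[cand]))
        layers.append((cand, sp.csr_matrix(block)))
        upper = cand  # recurse: sampled nodes become the next upper layer
    return layers
```

In the forward pass, each reweighted block would replace the full adjacency in the corresponding GCN layer, so every aggregation multiplies a small bipartite block instead of the full n x n matrix.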
Advancing beyond GraphSAGE's node-wise sampling and FastGCN's layer-independent sampling, LADIES conditions each layer's sampling distribution on the nodes chosen for the layer above, which produces a better-connected (denser) sampled computation graph. This confers a dual advantage: per-layer sample sizes remain fixed, keeping sampling complexity low, while the denser connectivity improves convergence. The layer-dependent importance weights concentrate probability on the nodes most relevant to the already-sampled layer, reducing estimator variance while guaranteeing that every sampled node is connected to the layer above.
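Concretely (up to minor notational simplification of the paper's formula), write \(\hat{P}\) for the normalized adjacency matrix and \(Q^{(l)}\) for the row selector of the nodes already sampled above layer \(l\); the layer-dependent probability of picking candidate node \(u\) is

\[
p^{(l)}(u) \;=\; \frac{\bigl\lVert Q^{(l)} \hat{P}_{:,u} \bigr\rVert_2^2}{\bigl\lVert Q^{(l)} \hat{P} \bigr\rVert_F^2},
\]

so a node sharing no edge with the upper layer receives probability zero. This is precisely the neighbor-dependent constraint that layer-independent schemes lack.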
This work has implications for both the theory and practice of GCN optimization. Its reduced memory and compute demands match the growing scale of applications built on large graph data, and on the theoretical side it extends our understanding of efficient training of deep models on graphs. Looking forward, the ideas behind LADIES may inspire further advances in sampling strategies, potentially extending beyond graph convolutional networks.
In conclusion, LADIES is a notable step in making GCN training efficient: it addresses a key computational bottleneck while preserving, and at times improving, generalization. Its layer-dependent sampling points toward new directions in both algorithm design and the deployment of GCNs in data-intensive environments.