
MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP Initialization (2210.00102v3)

Published 30 Sep 2022 in cs.LG and cs.SI

Abstract: Training graph neural networks (GNNs) on large graphs is complex and extremely time-consuming. This is attributed to overheads caused by sparse matrix multiplication, which are sidestepped when training multi-layer perceptrons (MLPs) with only node features. MLPs, by ignoring graph context, are simple and faster for graph data, however they usually sacrifice prediction accuracy, limiting their applications for graph data. We observe that for most message passing-based GNNs, we can trivially derive an analog MLP (we call this a PeerMLP) with an equivalent weight space, by setting the trainable parameters with the same shapes, making us curious about "how do GNNs using weights from a fully trained PeerMLP perform?" Surprisingly, we find that GNNs initialized with such weights significantly outperform their PeerMLPs, motivating us to use PeerMLP training as a precursor, initialization step to GNN training. To this end, we propose an embarrassingly simple, yet hugely effective initialization method for GNN training acceleration, called MLPInit. Our extensive experiments on multiple large-scale graph datasets with diverse GNN architectures validate that MLPInit can accelerate the training of GNNs (up to 33× speedup on OGB-Products) and often improve prediction performance (e.g., up to 7.97% improvement for GraphSAGE across 7 datasets for node classification, and up to 17.81% improvement across 4 datasets for link prediction on metric Hits@10). The code is available at https://github.com/snap-research/MLPInit-for-GNNs.

Citations (32)

Summary

  • The paper introduces MLPInit, which exploits the weight-space equivalence between GNNs and their PeerMLPs to accelerate GNN training.
  • Initializing GNNs with PeerMLP-trained weights improves prediction quality, with up to a 7.97% gain in node classification accuracy and a 17.81% gain in link prediction Hits@10.
  • Because the PeerMLP pre-training bypasses costly neighbor aggregation, the method cuts overall training time by up to 33× on large datasets such as OGB-Products.

MLPInit: Simple GNN Training Acceleration via MLP Initialization

The paper "MLPInit: Embarrassingly Simple GNN Training Acceleration with MLP Initialization" addresses the problem of efficiently training Graph Neural Networks (GNNs) on large-scale graphs by introducing an initialization strategy based on Multi-Layer Perceptrons (MLPs). GNNs have proven effective in practical tasks such as recommendation systems, knowledge graph analysis, and chemistry applications. Training them on massive graph datasets, however, carries significant computational overhead, primarily from sparse matrix multiplications.

The core proposition lies in the weight equivalence between MLPs and GNNs: for most message passing-based GNNs, an MLP (the PeerMLP) can be constructed whose trainable parameters have exactly the same shapes, so the two models share a weight space and weights transfer directly between them. MLPInit exploits this parity in two stages. First, the PeerMLP is trained on node features alone, ignoring the graph structure and thereby avoiding costly sparse operations. Its trained weights are then used to initialize the GNN, which converges considerably faster in the subsequent GNN training.
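The weight transfer at the heart of MLPInit can be illustrated with a toy GCN-style layer and its PeerMLP in plain PyTorch. This is a minimal sketch, not the authors' released code: the class names (`GCNLayer`, `PeerMLPLayer`), the bias-free layers, and the unnormalized aggregation `adj @ H @ W` are simplifying assumptions made for brevity.

```python
import torch
import torch.nn as nn


class GCNLayer(nn.Module):
    """Toy message-passing layer: H' = A @ H @ W (degree normalization omitted)."""

    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out, bias=False)

    def forward(self, x, adj):
        # Neighbor aggregation: the expensive sparse operation in real GNN training.
        return adj @ self.lin(x)


class PeerMLPLayer(nn.Module):
    """PeerMLP layer: identical weight shape, but no aggregation: H' = H @ W."""

    def __init__(self, d_in, d_out):
        super().__init__()
        self.lin = nn.Linear(d_in, d_out, bias=False)

    def forward(self, x):
        return self.lin(x)


d_in, d_hidden, n_classes = 64, 32, 7
gnn = nn.ModuleList([GCNLayer(d_in, d_hidden), GCNLayer(d_hidden, n_classes)])
mlp = nn.ModuleList([PeerMLPLayer(d_in, d_hidden), PeerMLPLayer(d_hidden, n_classes)])

# ... train `mlp` on node features only (cheap: no graph, no sparse ops) ...

# MLPInit: the trained PeerMLP weights initialize the GNN directly,
# because the two state dicts have identical keys and tensor shapes.
gnn.load_state_dict(mlp.state_dict())
```

Because both modules expose one weight matrix per layer under the same attribute name, the PeerMLP's state dict loads into the GNN unchanged; for real architectures such as GraphSAGE, the same one-to-one mapping is set up for each weight matrix of the chosen model.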

Several key findings substantiate this methodology:

  • Training Efficiency: MLPs are far cheaper to train than GNNs because they bypass neighbor aggregation entirely. PeerMLP training converges quickly and serves as an effective precursor to GNN training (a rough end-to-end sketch of this two-stage schedule follows this list).
  • Performance Improvements: GNNs initialized with MLP-trained weights achieve notably better prediction quality, with reported gains of up to 7.97% on node classification and up to 17.81% on link prediction (Hits@10).
  • Training Speedup: The strategy accelerates GNN training substantially, with reported speedups of up to 33× on large datasets such as OGB-Products.
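The two-stage schedule can be sketched end to end on synthetic data. This is an illustration under simplifying assumptions rather than the paper's training setup: the random graph, layer sizes, epoch counts, and the choice to let the PeerMLP and GNN forward passes share the same weight objects (equivalent to copying the fully trained PeerMLP state dict into the GNN) are all hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
n_nodes, d_in, n_classes = 500, 32, 5
x = torch.randn(n_nodes, d_in)                        # node features
y = torch.randint(0, n_classes, (n_nodes,))           # node labels
adj = (torch.rand(n_nodes, n_nodes) < 0.02).float()   # random stand-in adjacency

# One shared weight space: the PeerMLP and the GNN use the same two matrices.
w1 = nn.Linear(d_in, 64, bias=False)
w2 = nn.Linear(64, n_classes, bias=False)
params = list(w1.parameters()) + list(w2.parameters())


def peer_mlp(x):
    """Feature-only forward pass: no neighbor aggregation."""
    return w2(F.relu(w1(x)))


def gnn(x, adj):
    """Same weights, plus aggregation over the graph (the expensive part)."""
    return adj @ w2(F.relu(adj @ w1(x)))


# Stage 1 (MLPInit): many cheap PeerMLP epochs, no adjacency involved.
opt = torch.optim.Adam(params, lr=1e-2)
for _ in range(50):
    opt.zero_grad()
    F.cross_entropy(peer_mlp(x), y).backward()
    opt.step()

# Stage 2: GNN training resumes from the PeerMLP optimum instead of a random
# initialization, so far fewer aggregation-heavy epochs are needed.
opt = torch.optim.Adam(params, lr=1e-2)
for _ in range(10):
    opt.zero_grad()
    F.cross_entropy(gnn(x, adj), y).backward()
    opt.step()
```

The sketch only illustrates that stage 1 never touches the adjacency; in practice the second stage is ordinary GNN training on the real graph.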

The implications of this research are noteworthy, offering both practical and theoretical insight into GNN optimization. MLPInit improves the scalability of GNN training, making better use of computational resources and shortening training times on large graph datasets. Furthermore, the paper notes that MLPInit is orthogonal to other acceleration strategies, so its benefits can compound with techniques such as graph sparsification or weight quantization.

Looking ahead, future research could extend MLPInit across further GNN architectures and application domains. Tuning the balance between PeerMLP pre-training epochs and GNN fine-tuning may also yield additional efficiency gains. As AI systems tackle increasingly large and complex graph datasets, methods such as MLPInit offer a promising way to manage the associated computational demands.

Overall, this paper exemplifies a straightforward yet effective approach to the computational challenges of GNN training. It establishes initializing GNNs from trained MLPs as a cost-effective option for both research and real-world systems built on graph neural networks.