FastSample: Accelerating Distributed Graph Neural Network Training for Billion-Scale Graphs (2311.17847v1)

Published 29 Nov 2023 in cs.DC

Abstract: Training Graph Neural Networks (GNNs) on a large monolithic graph presents unique challenges, as the graph cannot fit within a single machine and cannot be decomposed into smaller disconnected components. Distributed sampling-based training distributes the graph across multiple machines and trains the GNN on small parts of the graph that are randomly sampled every training iteration. We show that in a distributed environment, the sampling overhead is a significant component of the training time for large-scale graphs. We propose FastSample, which is composed of two synergistic techniques that greatly reduce the distributed sampling time: 1) a new graph partitioning method that eliminates most of the communication rounds in distributed sampling, and 2) a novel, highly optimized sampling kernel that reduces memory movement during sampling. We test FastSample on large-scale graph benchmarks and show that it speeds up distributed sampling-based GNN training by up to 2x with no loss in accuracy.
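To make the sampling step concrete, below is a minimal, self-contained sketch of fanout-based neighbor sampling over a graph stored in CSR form, the kind of operation that runs every training iteration in sampling-based GNN training. This is an illustrative example in Python/NumPy, not FastSample's optimized kernel or partitioning scheme; the function name, the toy graph, and the partition-ownership array are all hypothetical.

```python
import numpy as np

# Illustrative sketch only (not the paper's implementation).
# In distributed sampling-based training, each iteration samples a small
# subgraph like this; neighbors owned by remote machines are what trigger
# the communication rounds that FastSample's partitioning aims to avoid.

def sample_neighbors(indptr, indices, seeds, fanout, rng):
    """Sample up to `fanout` neighbors for every seed node (CSR graph)."""
    src, dst = [], []
    for s in seeds:
        nbrs = indices[indptr[s]:indptr[s + 1]]
        if len(nbrs) > fanout:
            nbrs = rng.choice(nbrs, size=fanout, replace=False)
        src.extend(nbrs)
        dst.extend([s] * len(nbrs))
    return np.asarray(src), np.asarray(dst)

# Toy 5-node graph in CSR form (hypothetical data).
indptr = np.array([0, 2, 5, 6, 8, 10])
indices = np.array([1, 2, 0, 3, 4, 0, 1, 4, 1, 3])
owner = np.array([0, 0, 0, 1, 1])  # which machine owns each node

rng = np.random.default_rng(0)
seeds = np.array([0, 1])
for hop, fanout in enumerate([2, 2]):  # 2-hop sampling, fanout 2 per hop
    src, dst = sample_neighbors(indptr, indices, seeds, fanout, rng)
    remote = np.unique(src[owner[src] != 0])  # would require communication
    print(f"hop {hop}: edges {list(zip(src, dst))}, remote nodes {remote}")
    seeds = np.unique(src)  # the next hop expands from the sampled frontier
```

In a real distributed setting, each hop whose sampled frontier touches remote partitions costs a communication round, which is why a partitioning that keeps sampled neighborhoods local can remove most of that overhead.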

Authors (6)
  1. Hesham Mostafa (26 papers)
  2. Adam Grabowski (2 papers)
  3. Md Asadullah Turja (4 papers)
  4. Juan Cervino (16 papers)
  5. Alejandro Ribeiro (281 papers)
  6. Nageen Himayat (24 papers)
Citations (1)
