GNNHLS: Evaluating Graph Neural Network Inference via High-Level Synthesis (2309.16022v1)
Abstract: With the ever-growing popularity of Graph Neural Networks (GNNs), efficient GNN inference is gaining tremendous attention. Field-Programmable Gate Arrays (FPGAs) are a promising execution platform due to their fine-grained parallelism, low power consumption, reconfigurability, and support for concurrent execution. Better still, High-Level Synthesis (HLS) tools bridge the gap between the non-trivial effort of FPGA development and the rapid emergence of new GNN models. In this paper, we propose GNNHLS, an open-source framework to comprehensively evaluate GNN inference acceleration on FPGAs via HLS, comprising a software stack for data generation and baseline deployment, and FPGA implementations of 6 well-tuned GNN HLS kernels. We evaluate GNNHLS on 4 graph datasets with distinct topologies and scales. The results show that GNNHLS achieves up to 50.8x speedup and 423x energy reduction relative to the CPU baselines. Compared with the GPU baselines, GNNHLS achieves up to 5.16x speedup and 74.5x energy reduction.
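The abstract does not show the kernels themselves, so as a rough illustration of what a GNN HLS kernel looks like, below is a minimal Vitis-HLS-style sketch of mean-neighbor aggregation (the message-passing core of GraphSAGE-like layers) over a graph in CSR form. All names (`aggregate_mean`, `MAX_FEAT`, the `gmem` bundles) and pragma choices are assumptions for illustration, not the actual GNNHLS kernels.

```cpp
// Illustrative sketch only: mean-neighbor aggregation over a CSR graph,
// written in the Vitis HLS idiom. Not the actual GNNHLS kernel code.
#include <cstdint>

constexpr int MAX_FEAT = 128;  // assumed (hypothetical) feature width

void aggregate_mean(const int32_t *row_ptr,   // CSR row offsets, size n+1
                    const int32_t *col_idx,   // CSR neighbor indices
                    const float   *feat_in,   // n x MAX_FEAT input features
                    float         *feat_out,  // n x MAX_FEAT aggregated output
                    int            n) {
#pragma HLS INTERFACE m_axi port=row_ptr bundle=gmem0
#pragma HLS INTERFACE m_axi port=col_idx bundle=gmem1
#pragma HLS INTERFACE m_axi port=feat_in bundle=gmem2
#pragma HLS INTERFACE m_axi port=feat_out bundle=gmem3
    for (int v = 0; v < n; ++v) {
        // Per-vertex accumulator kept on-chip; partitioned so the feature
        // dimension can be updated fully in parallel.
        float acc[MAX_FEAT];
#pragma HLS ARRAY_PARTITION variable=acc complete
        for (int f = 0; f < MAX_FEAT; ++f) {
#pragma HLS UNROLL
            acc[f] = 0.0f;
        }
        const int32_t begin = row_ptr[v];
        const int32_t end   = row_ptr[v + 1];
        // Pipeline the edge loop: one neighbor's feature vector per cycle.
        for (int32_t e = begin; e < end; ++e) {
#pragma HLS PIPELINE II=1
            const int32_t u = col_idx[e];
            for (int f = 0; f < MAX_FEAT; ++f) {
#pragma HLS UNROLL
                acc[f] += feat_in[u * MAX_FEAT + f];
            }
        }
        const float inv_deg = (end > begin) ? 1.0f / (end - begin) : 0.0f;
        for (int f = 0; f < MAX_FEAT; ++f) {
#pragma HLS UNROLL
            feat_out[v * MAX_FEAT + f] = acc[f] * inv_deg;
        }
    }
}
```

Pipelining the edge loop at II=1 while unrolling the feature dimension is a standard HLS idiom that trades on-chip resources for throughput; the paper's "well-tuned" kernels presumably go further (e.g., dataflow decomposition and burst memory accesses), but this sketch conveys the basic structure being synthesized.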