Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
110 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
44 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

ScalaBFS: A Scalable BFS Accelerator on HBM-Enhanced FPGAs (2105.11754v2)

Published 25 May 2021 in cs.AR

Abstract: High Bandwidth Memory (HBM) provides massive aggregated memory bandwidth by exposing multiple memory channels to the processing units. To achieve high performance, an accelerator built on top of an FPGA configured with HBM (i.e., FPGA-HBM platform) needs to scale its performance according to the available memory channels. In this paper, we propose an accelerator for BFS (Breadth-First Search) algorithm, named as ScalaBFS, that builds multiple processing elements to sufficiently exploit the high bandwidth of HBM to improve efficiency. We implement the prototype system of ScalaBFS and conduct BFS in both real-world and synthetic scale-free graphs on Xilinx Alveo U280 FPGA card real hardware. The experimental results show that ScalaBFS scales its performance almost linearly according to the available memory pseudo channels (PCs) from the HBM2 subsystem of U280. By fully using the 32 PCs and building 64 processing elements (PEs) on U280, ScalaBFS achieves a performance up to 19.7 GTEPS (Giga Traversed Edges Per Second). When conducting BFS in sparse real-world graphs, ScalaBFS achieves equivalent GTEPS to Gunrock running on the state-of-art Nvidia V100 GPU that features 64-PC HBM2 (twice memory bandwidth than U280).

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (8)
  1. Kexin Li (24 papers)
  2. Chenhao Liu (2 papers)
  3. Zhiyuan Shao (4 papers)
  4. Zeke Wang (17 papers)
  5. Minkang Wu (1 paper)
  6. Jiajie Chen (31 papers)
  7. Xiaofei Liao (11 papers)
  8. Hai Jin (83 papers)
Citations (6)

Summary

We haven't generated a summary for this paper yet.