
SASA: A Scalable and Automatic Stencil Acceleration Framework for Optimized Hybrid Spatial and Temporal Parallelism on HBM-based FPGAs (2208.10770v1)

Published 23 Aug 2022 in cs.AR

Abstract: Stencil computation is one of the fundamental computing patterns in many application domains such as scientific computing and image processing. While there are promising studies that accelerate stencils on FPGAs, an automated acceleration framework that systematically explores both spatial and temporal parallelism for iterative stencils, which may be either computation-bound or memory-bound, is still lacking. In this paper, we present SASA, a scalable and automatic stencil acceleration framework on modern HBM-based FPGAs. SASA takes the high-level stencil DSL and FPGA platform as inputs, automatically exploits the best spatial and temporal parallelism configuration based on our accurate analytical model, and generates the optimized FPGA design with the best parallelism configuration in TAPA high-level synthesis C++ as well as its corresponding host code. Compared to the state-of-the-art automatic stencil acceleration framework SODA, which only exploits temporal parallelism, SASA achieves an average speedup of 3.74x and up to 15.73x speedup on the HBM-based Xilinx Alveo U280 FPGA board for a wide range of stencil kernels.
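
To make the terms in the abstract concrete, the following is a minimal Python sketch of an iterative stencil. It is not SASA's DSL, its analytical model, or its generated TAPA code. The 5-point Jacobi kernel and the names `jacobi_step`, `run_iterations`, and `jacobi_step_tiled` are assumptions chosen for illustration. Splitting the grid into row bands with one halo row on each side is one common way to picture spatial parallelism, and repeating the step over several iterations is the loop that temporal parallelism would chain into a pipeline.

```python
def jacobi_step(grid):
    """Apply one iteration of a 5-point Jacobi stencil.

    Boundary cells are copied unchanged; each interior cell becomes the
    average of itself and its four direct neighbours.
    """
    rows, cols = len(grid), len(grid[0])
    out = [row[:] for row in grid]
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            out[i][j] = (grid[i][j] + grid[i - 1][j] + grid[i + 1][j]
                         + grid[i][j - 1] + grid[i][j + 1]) / 5.0
    return out


def run_iterations(grid, iterations):
    """Iterative stencil: feed each step's output into the next step.

    A temporally parallel design would chain these steps as pipeline
    stages; here they simply run one after another.
    """
    for _ in range(iterations):
        grid = jacobi_step(grid)
    return grid


def jacobi_step_tiled(grid, num_tiles):
    """Same step as jacobi_step, computed band by band (illustrative).

    Each band of rows is processed independently, reading one extra
    halo row above and below, which is what would let separate
    processing elements work on different bands at once.
    """
    rows = len(grid)
    out = [row[:] for row in grid]
    band = (rows + num_tiles - 1) // num_tiles  # rows owned per band
    for start in range(0, rows, band):
        end = min(start + band, rows)
        lo = max(start - 1, 0)      # include halo row above, if any
        hi = min(end + 1, rows)     # include halo row below, if any
        tile_out = jacobi_step([row[:] for row in grid[lo:hi]])
        for i in range(start, end):  # write back only the owned rows
            out[i] = tile_out[i - lo]
    return out
```

Because every band reads only the previous iteration's values, the tiled version produces the same grid as the untiled one. The halo rows are the data that neighbouring bands would have to exchange between iterations.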

Authors (6)
  1. Xingyu Tian (2 papers)
  2. Zhifan Ye (12 papers)
  3. Alec Lu (4 papers)
  4. Licheng Guo (9 papers)
  5. Yuze Chi (14 papers)
  6. Zhenman Fang (21 papers)
Citations (8)
