Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
119 tokens/sec
GPT-4o
56 tokens/sec
Gemini 2.5 Pro Pro
43 tokens/sec
o3 Pro
6 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

FPGA Based Accelerator for Neural Networks Computation with Flexible Pipelining (2112.15443v1)

Published 28 Dec 2021 in cs.AR

Abstract: FPGA is appropriate for fix-point neural networks computing due to high power efficiency and configurability. However, its design must be intensively refined to achieve high performance using limited hardware resources. We present an FPGA-based neural networks accelerator and its optimization framework, which can achieve optimal efficiency for various CNN models and FPGA resources. Targeting high throughput, we adopt layer-wise pipeline architecture for higher DSP utilization. To get the optimal performance, a flexible algorithm to allocate balanced hardware resources to each layer is also proposed, supported by activation buffer design. Through our well-balanced implementation of four CNN models on ZC706, the DSP utilization and efficiency are over 90%. For VGG16 on ZC706, the proposed accelerator achieves the performance of 2.58x, 1.53x and 1.35x better than the referenced non-pipeline architecture [1], pipeline architecture [2] and [3], respectively.

User Edit Pencil Streamline Icon: https://streamlinehq.com
Authors (3)
  1. Qingyang Yi (1 paper)
  2. Heming Sun (39 papers)
  3. Masahiro Fujita (6 papers)
Citations (2)

Summary

We haven't generated a summary for this paper yet.