
Design Flow of Accelerating Hybrid Extremely Low Bit-width Neural Network in Embedded FPGA (1808.04311v2)

Published 31 Jul 2018 in cs.DC, cs.CV, and cs.LG

Abstract: Neural network accelerators with low latency and low energy consumption are desirable for edge computing. To create such accelerators, we propose a design flow for accelerating the extremely low bit-width neural network (ELB-NN) in embedded FPGAs with hybrid quantization schemes. This flow covers both network training and FPGA-based network deployment, which facilitates design space exploration and simplifies the tradeoff between network accuracy and computation efficiency. Using this flow helps hardware designers deliver a network accelerator for edge devices under strict resource and power constraints. We demonstrate the proposed flow by supporting hybrid ELB settings within a neural network. Results show that our design can deliver very high performance, peaking at 10.3 TOPS, and classify up to 325.3 images/s/W while running large-scale neural networks at under 5 W on an embedded FPGA. To the best of our knowledge, this is the most energy-efficient solution compared to GPU or other FPGA implementations reported in the literature so far.
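The "hybrid" in hybrid ELB-NN means different layers use different bit-widths (e.g., higher precision for the first layer, binary or ternary weights in the middle). As a rough illustration of the idea, not the paper's actual training or hardware flow, the sketch below applies symmetric uniform quantization with a hypothetical per-layer bit plan; the layer names and bit assignments are assumptions for the example:

```python
import numpy as np

def quantize_uniform(x, bits):
    """Symmetric uniform quantization of x to the given bit-width.

    For bits == 1 the weights collapse to {-scale, +scale}, where scale is
    the mean absolute value, as in binary-weight networks. (Illustrative
    scheme only; the paper's exact quantizer may differ.)
    """
    if bits == 1:
        scale = np.mean(np.abs(x))
        return np.where(x >= 0, scale, -scale)
    levels = 2 ** (bits - 1) - 1              # symmetric integer levels per side
    scale = np.max(np.abs(x)) / levels
    return np.round(x / scale) * scale

# Hypothetical hybrid setting: keep the input layer at 8 bits, push the
# middle and final layers to extremely low bit-widths.
rng = np.random.default_rng(0)
layer_weights = {
    "conv1": rng.standard_normal((16, 3, 3, 3)),
    "conv2": rng.standard_normal((32, 16, 3, 3)),
    "fc":    rng.standard_normal((10, 128)),
}
bit_plan = {"conv1": 8, "conv2": 2, "fc": 1}  # per-layer bit-widths (assumed)

quantized = {name: quantize_uniform(w, bit_plan[name])
             for name, w in layer_weights.items()}

# The 1-bit layer now holds only two distinct values (+/- scale).
print(len(np.unique(quantized["fc"])))  # 2
```

Reducing middle layers to 1-2 bits is what lets the FPGA accelerator replace multipliers with sign flips and small adders, which is the source of the throughput and energy figures quoted in the abstract.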

Authors (6)
  1. Junsong Wang (8 papers)
  2. Qiuwen Lou (7 papers)
  3. Xiaofan Zhang (79 papers)
  4. Chao Zhu (51 papers)
  5. Yonghua Lin (13 papers)
  6. Deming Chen (62 papers)
Citations (92)