HybridDNN: A Framework for High-Performance Hybrid DNN Accelerator Design and Implementation (2004.03804v1)

Published 8 Apr 2020 in cs.AR and cs.CV

Abstract: To speed up Deep Neural Network (DNN) accelerator design and enable effective implementation, we propose HybridDNN, a framework for building high-performance hybrid DNN accelerators and delivering FPGA-based hardware implementations. Novel techniques include a highly flexible and scalable architecture with a hybrid Spatial/Winograd convolution (CONV) Processing Engine (PE), a comprehensive design space exploration tool, and a complete design flow to fully support accelerator design and implementation. Experimental results show that accelerators generated by HybridDNN deliver 3375.7 GOPS on a high-end FPGA (VU9P) and 83.3 GOPS on an embedded FPGA (PYNQ-Z1), a 1.8x performance improvement over state-of-the-art accelerator designs. This demonstrates that HybridDNN is flexible and scalable and can target both cloud and embedded hardware platforms with vastly different resource constraints.
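
The hybrid PE mentioned in the abstract mixes spatial (direct) and Winograd convolution dataflows. As a rough illustration of the arithmetic trade-off only, and not the paper's FPGA implementation, the NumPy sketch below computes the standard 1-D Winograd F(2,3) minimal-filtering transform, which produces two convolution outputs with 4 multiplications instead of the 6 required by direct convolution. The transform matrices are the usual Winograd constants and the function names are illustrative.

```python
import numpy as np

# Standard Winograd F(2,3) transform matrices (textbook constants,
# not taken from the HybridDNN paper).
B_T = np.array([[1,  0, -1,  0],
                [0,  1,  1,  0],
                [0, -1,  1,  0],
                [0,  1,  0, -1]], dtype=float)
G = np.array([[1.0,  0.0, 0.0],
              [0.5,  0.5, 0.5],
              [0.5, -0.5, 0.5],
              [0.0,  0.0, 1.0]])
A_T = np.array([[1, 1,  1,  0],
                [0, 1, -1, -1]], dtype=float)

def winograd_f23(d, g):
    """Two outputs of a 3-tap convolution from a 4-sample tile, 4 multiplies."""
    return A_T @ ((G @ g) * (B_T @ d))

def spatial_f23(d, g):
    """Reference spatial (direct) computation of the same outputs, 6 multiplies."""
    return np.array([d[0]*g[0] + d[1]*g[1] + d[2]*g[2],
                     d[1]*g[0] + d[2]*g[1] + d[3]*g[2]])

d = np.array([1.0, 2.0, 3.0, 4.0])   # input tile
g = np.array([0.5, -1.0, 2.0])       # 3-tap filter
assert np.allclose(winograd_f23(d, g), spatial_f23(d, g))
```

In 2-D, tiling this transform (e.g., F(2x2, 3x3)) reduces multiplications per output at the cost of extra additions and transform logic, which is why a hybrid design can pick spatial or Winograd mode per layer depending on kernel size and resource constraints.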

Authors (5)
  1. Hanchen Ye (9 papers)
  2. Xiaofan Zhang (79 papers)
  3. Zhize Huang (1 paper)
  4. Gengsheng Chen (1 paper)
  5. Deming Chen (62 papers)
Citations (58)
