Papers
Topics
Authors
Recent
Gemini 2.5 Flash
Gemini 2.5 Flash
125 tokens/sec
GPT-4o
53 tokens/sec
Gemini 2.5 Pro Pro
42 tokens/sec
o3 Pro
4 tokens/sec
GPT-4.1 Pro
47 tokens/sec
DeepSeek R1 via Azure Pro
28 tokens/sec
2000 character limit reached

Towards Design Space Exploration and Optimization of Fast Algorithms for Convolutional Neural Networks (CNNs) on FPGAs (1903.01811v1)

Published 5 Mar 2019 in eess.SP, cs.CV, and eess.IV

Abstract: Convolutional Neural Networks (CNNs) have gained widespread popularity in the field of computer vision and image processing. Due to huge computational requirements of CNNs, dedicated hardware-based implementations are being explored to improve their performance. Hardware platforms such as Field Programmable Gate Arrays (FPGAs) are widely being used to design parallel architectures for this purpose. In this paper, we analyze Winograd minimal filtering or fast convolution algorithms to reduce the arithmetic complexity of convolutional layers of CNNs. We explore a complex design space to find the sets of parameters that result in improved throughput and power-efficiency. We also design a pipelined and parallel Winograd convolution engine that improves the throughput and power-efficiency while reducing the computational complexity of the overall system. Our proposed designs show up to 4.75$\times$ and 1.44$\times$ improvements in throughput and power-efficiency, respectively, in comparison to the state-of-the-art design while using approximately 2.67$\times$ more multipliers. Furthermore, we obtain savings of up to 53.6\% in logic resources compared with the state-of-the-art implementation.

Citations (11)

Summary

We haven't generated a summary for this paper yet.