Auto-ViT-Acc: An FPGA-Aware Automatic Acceleration Framework for Vision Transformer with Mixed-Scheme Quantization (2208.05163v1)

Published 10 Aug 2022 in cs.CV, cs.LG, and eess.IV

Abstract: Vision transformers (ViTs) are achieving significantly improved accuracy on computer vision tasks. However, their complex architecture and enormous computation and storage demands create an urgent need for new hardware accelerator design methodologies. This work proposes an FPGA-aware automatic ViT acceleration framework based on the proposed mixed-scheme quantization. To the best of our knowledge, this is the first FPGA-based ViT acceleration framework to explore model quantization. Compared with state-of-the-art ViT quantization work (an algorithmic approach only, without hardware acceleration), our quantization achieves 0.47% to 1.36% higher Top-1 accuracy under the same bit-width. Compared with the 32-bit floating-point baseline FPGA accelerator, our accelerator achieves an approximately 5.6x improvement in frame rate (i.e., 56.8 FPS vs. 10.0 FPS) with a 0.71% accuracy drop on the ImageNet dataset for DeiT-base.
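
To make the quantization idea concrete, below is a minimal sketch of mixed-scheme quantization, assuming the mixed scheme combines uniform fixed-point quantization (suited to FPGA DSP multipliers) with power-of-two quantization (where multiplications reduce to bit shifts on LUTs). The function names and the `pow2_ratio` knob are illustrative assumptions, not the paper's API; the actual framework determines per-layer scheme assignment and bit-widths to fit FPGA resources.

```python
# Illustrative sketch of mixed-scheme quantization (not the authors' code).
import numpy as np

def quantize_fixed_point(w, bits=4):
    """Uniform symmetric fixed-point quantization to `bits` bits."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(w).max() / qmax
    q = np.clip(np.round(w / scale), -qmax, qmax)
    return q * scale  # dequantized values, for simulating accuracy

def quantize_power_of_two(w, bits=4):
    """Power-of-two quantization: each weight becomes sign * 2^k,
    so a hardware multiply becomes a bit shift."""
    sign = np.sign(w)                       # zeros stay zero (sign == 0)
    mag = np.where(w == 0, 1e-12, np.abs(w))  # avoid log2(0) warnings
    exp = np.round(np.log2(mag))
    # Restrict the exponent range to what `bits` can roughly encode
    # (illustrative; the exact encoding is hardware-specific).
    exp = np.clip(exp, exp.max() - (2 ** (bits - 1) - 1), exp.max())
    return sign * 2.0 ** exp

def mixed_scheme_quantize(w, pow2_ratio=0.5, bits=4):
    """Split the rows of a weight matrix between the two schemes.
    `pow2_ratio` is a hypothetical knob standing in for the framework's
    resource-aware scheme assignment."""
    n_pow2 = int(len(w) * pow2_ratio)
    out = w.copy()
    out[:n_pow2] = quantize_power_of_two(w[:n_pow2], bits)
    out[n_pow2:] = quantize_fixed_point(w[n_pow2:], bits)
    return out

rng = np.random.default_rng(0)
W = rng.normal(0, 0.02, size=(8, 16))
Wq = mixed_scheme_quantize(W, pow2_ratio=0.5, bits=4)
print("mean abs quantization error:", np.abs(W - Wq).mean())
```

Assigning part of a layer to power-of-two arithmetic frees DSP slices for the fixed-point portion, which is what lets the accelerator raise throughput at a fixed resource budget.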

Authors (12)
  1. Zhengang Li (31 papers)
  2. Mengshu Sun (41 papers)
  3. Alec Lu (4 papers)
  4. Haoyu Ma (45 papers)
  5. Geng Yuan (58 papers)
  6. Yanyue Xie (12 papers)
  7. Hao Tang (378 papers)
  8. Yanyu Li (31 papers)
  9. Miriam Leeser (10 papers)
  10. Zhangyang Wang (374 papers)
  11. Xue Lin (92 papers)
  12. Zhenman Fang (21 papers)
Citations (46)