Convolutional Neural Networks Quantization with Attention

Published 30 Sep 2022 in cs.AI and cs.CV | arXiv:2209.15317v1

Abstract: It has been shown that Deep Convolutional Neural Networks (DCNNs), trained with 32-bit floating-point numbers, can operate at low precision during inference, saving memory and power. However, quantizing a network is usually accompanied by a drop in accuracy. Here, we propose a method, double-stage Squeeze-and-Threshold (double-stage ST), which uses the attention mechanism to quantize networks and achieves state-of-the-art results. With our method, a 3-bit model can exceed the accuracy of the full-precision baseline model. The proposed double-stage ST activation quantization is easy to apply: simply insert it before the convolution.
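
Since the abstract gives only the high-level recipe (an attention-driven, two-stage threshold inserted before each convolution), the sketch below is an illustrative reconstruction, not the authors' implementation. The module name SqueezeAndThresholdQuant, the SE-style squeeze followed by two small fully connected stages, the per-channel clipping threshold, and the uniform 3-bit quantizer with a straight-through estimator are all assumptions made for the example.

```python
# Illustrative sketch only: the paper's exact double-stage ST design is not
# given in the abstract, so every name and architectural choice below
# (SE-style squeeze, two FC stages, per-channel thresholds, uniform 3-bit
# quantizer with a straight-through estimator) is an assumption.
import torch
import torch.nn as nn


class SqueezeAndThresholdQuant(nn.Module):
    """Hypothetical attention-driven activation quantizer (not the authors' code)."""

    def __init__(self, channels: int, bits: int = 3, reduction: int = 4):
        super().__init__()
        # Squeeze: global average pooling gives one context value per channel,
        # as in Squeeze-and-Excitation.
        self.squeeze = nn.AdaptiveAvgPool2d(1)
        # Two small fully connected stages (the assumed "double stage") map the
        # context to a per-channel threshold factor in (0, 1).
        self.stage1 = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True)
        )
        self.stage2 = nn.Sequential(
            nn.Linear(channels // reduction, channels), nn.Sigmoid()
        )
        self.levels = 2 ** bits - 1  # number of uniform quantization steps

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        n, c, _, _ = x.shape
        # Attention path: threshold factor per channel, scaled by the observed
        # per-channel activation maximum so the threshold tracks the range.
        ctx = self.squeeze(x).view(n, c)
        t = self.stage2(self.stage1(ctx)).view(n, c, 1, 1)
        t = t * x.detach().abs().amax(dim=(2, 3), keepdim=True).clamp(min=1e-6)
        # Clip to [0, t] (assumes post-ReLU activations), then quantize
        # uniformly to 2^bits - 1 steps.
        xc = torch.minimum(x.clamp(min=0), t)
        xq = torch.round(xc / t * self.levels) / self.levels * t
        # Straight-through estimator: quantized values forward, smooth
        # gradients (through the clipped path) backward.
        return xc + (xq - xc).detach()


# Usage: insert the quantizer directly before a convolution, as the abstract
# describes. Shapes and hyperparameters here are arbitrary.
quant_conv = nn.Sequential(
    SqueezeAndThresholdQuant(channels=64, bits=3),
    nn.Conv2d(64, 128, kernel_size=3, padding=1),
)
y = quant_conv(torch.relu(torch.randn(8, 64, 32, 32)))
```

In this reading, the attention path only decides how activations are clipped and quantized; the convolution itself is unchanged, which is what makes the module drop-in.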
