
Compression-aware Projection with Greedy Dimension Reduction for Convolutional Neural Network Activations (2110.08828v1)

Published 17 Oct 2021 in cs.CV, cs.LG, and eess.SP

Abstract: Convolutional neural networks (CNNs) achieve remarkable performance in a wide range of fields. However, intensive memory access of activations introduces considerable energy consumption, impeding deployment of CNNs on resource-constrained edge devices. Existing works in activation compression propose to transform feature maps for higher compressibility, thus enabling dimension reduction. Nevertheless, in the case of aggressive dimension reduction, these methods lead to severe accuracy drop. To improve the trade-off between classification accuracy and compression ratio, we propose a compression-aware projection system, which employs a learnable projection to compensate for the reconstruction loss. In addition, a greedy selection metric is introduced to optimize the layer-wise compression ratio allocation by considering both accuracy and #bits reduction simultaneously. Our test results show that the proposed methods effectively reduce 2.91x~5.97x memory access with negligible accuracy drop on MobileNetV2/ResNet18/VGG16.
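
The greedy layer-wise allocation mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define the selection metric, so the score used here (bits saved per unit of accuracy drop) is an assumption, as are the function name `greedy_allocate`, the per-layer lookup tables, and the accuracy budget `max_total_drop`. It is meant only to show the general shape of a greedy search that weighs accuracy against bit reduction, not the authors' method.

```python
def greedy_allocate(bits_saved, acc_drop, max_total_drop):
    """Greedily pick per-layer compression levels (illustrative only).

    bits_saved[l][k] and acc_drop[l][k] are hypothetical measurements of
    the effect of moving layer l from compression level k to level k + 1.
    The score below (bits saved per unit accuracy drop) is an assumed
    stand-in for the paper's metric, which the abstract does not specify.
    """
    levels = [0] * len(bits_saved)   # current compression level per layer
    total_drop = 0.0                 # accumulated accuracy drop so far

    while True:
        best_layer, best_score = None, float("-inf")
        for layer, level in enumerate(levels):
            if level >= len(bits_saved[layer]):
                continue             # layer has no further level to try
            drop = acc_drop[layer][level]
            if total_drop + drop > max_total_drop:
                continue             # step would exceed the accuracy budget
            score = bits_saved[layer][level] / (drop + 1e-9)
            if score > best_score:
                best_layer, best_score = layer, score

        if best_layer is None:       # no admissible step remains
            return levels, total_drop

        total_drop += acc_drop[best_layer][levels[best_layer]]
        levels[best_layer] += 1


# Toy numbers, made up for illustration (two layers, two steps each).
example_bits = [[100, 50], [80, 80]]
example_drop = [[0.1, 0.5], [0.2, 0.2]]
example_levels, example_total = greedy_allocate(example_bits, example_drop, 0.6)
```

With these toy inputs the search takes the cheap first step on layer 0, then both steps on layer 1, and stops once the remaining step on layer 0 would exceed the budget. In a real setting the two tables would come from evaluating the network at each candidate compression level.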

Authors (4)
  1. Yu-Shan Tai (4 papers)
  2. Chieh-Fang Teng (11 papers)
  3. Cheng-Yang Chang (4 papers)
  4. An-Yeu Wu (12 papers)
Citations (7)
