PowerPruning: Selecting Weights and Activations for Power-Efficient Neural Network Acceleration (2303.13997v2)

Published 24 Mar 2023 in cs.NE and cs.AI

Abstract: Deep neural networks (DNNs) have been successfully applied in various fields. A major challenge of deploying DNNs, especially on edge devices, is power consumption, due to the large number of multiply-and-accumulate (MAC) operations. To address this challenge, we propose PowerPruning, a novel method that reduces power consumption in digital neural network accelerators by selecting weights that incur less power in MAC operations. In addition, the timing characteristics of the selected weights, together with all activation transitions, are evaluated, and the weights and activations that lead to small delays are further selected. Consequently, the maximum delay of the sensitized circuit paths in the MAC units is reduced without modifying the MAC units themselves, which allows the supply voltage to be scaled down flexibly to reduce power consumption further. Together with retraining, the proposed method can reduce the power consumption of DNNs on hardware by up to 78.3% with only a slight accuracy loss.
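To make the weight-selection step concrete, below is a minimal sketch of how weights might be projected onto a "low-power" subset of representable values. It assumes a per-value power-cost table for the MAC unit (e.g., obtained from gate-level characterization); the function name, the nearest-value projection, and the toy power model are hypothetical illustrations, not the paper's actual implementation, and retraining after projection is omitted.

```python
# Hypothetical sketch: project quantized weights onto the subset of
# values with the lowest estimated MAC power cost. The power_cost
# table is assumed to come from hardware characterization; it is not
# part of this sketch.
import numpy as np

def select_low_power_weights(weights, power_cost, keep_ratio=0.5):
    """Snap each weight to the nearest value in the low-power subset.

    weights:    integer-quantized weight tensor (e.g., int4/int8 values)
    power_cost: dict mapping each representable weight value to an
                estimated average MAC power for that value
    keep_ratio: fraction of representable values kept as "allowed"
    """
    # Rank representable values by estimated power; keep the cheapest ones.
    ranked = sorted(power_cost, key=power_cost.get)
    allowed = np.array(ranked[: max(1, int(len(ranked) * keep_ratio))])

    # For every weight, find the nearest allowed value and snap to it.
    flat = weights.reshape(-1, 1)                         # shape (N, 1)
    idx = np.abs(flat - allowed[None, :]).argmin(axis=1)  # nearest index
    return allowed[idx].reshape(weights.shape)

# Toy usage: 4-bit signed weights with a made-up power model in which
# values with fewer set bits in two's complement are cheaper.
values = np.arange(-8, 8)
cost = {int(v): bin(int(v) & 0xF).count("1") for v in values}
w = np.random.randint(-8, 8, size=(4, 4))
w_low_power = select_low_power_weights(w, cost, keep_ratio=0.5)
```

In this reading, restricting weights to a cheap subset shrinks both switching activity and the set of sensitized timing paths, which is what would then permit the voltage scaling the abstract describes; a subsequent retraining pass would recover the accuracy lost to the projection.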
