EncodingNet: A Novel Encoding-based MAC Design for Efficient Neural Network Acceleration (2402.18595v2)

Published 25 Feb 2024 in cs.AR, cs.CE, and cs.LG

Abstract: Deep neural networks (DNNs) have achieved great breakthroughs in many fields such as image classification and natural language processing. However, executing DNNs requires massive numbers of multiply-accumulate (MAC) operations on hardware and thus incurs large power consumption. To address this challenge, we propose a novel digital MAC design based on encoding. In this new design, the multipliers are replaced by simple logic gates that represent the results with a wide bit representation. The outputs of the new multipliers are added by bit-wise weighted accumulation, and the accumulation results are compatible with existing computing platforms that accelerate neural networks. Since the multiplication function is replaced by a simple logic representation, the critical paths in the resulting circuits become much shorter. Correspondingly, the pipelining stages and intermediate registers used to store partial sums in the MAC array can be reduced, leading to a significantly smaller area as well as better power efficiency. The proposed design has been synthesized and verified with ResNet18-Cifar10, ResNet20-Cifar100, ResNet50-ImageNet, MobileNetV2-Cifar10, MobileNetV2-Cifar100, and EfficientNetB0-ImageNet. The experimental results confirm a reduction of circuit area by up to 48.79% and a reduction of the power consumption of executing DNNs by up to 64.41%, while the accuracy of the neural networks is well maintained. The open-source code of this work is available on GitHub at https://github.com/Bo-Liu-TUM/EncodingNet/.
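
The abstract outlines the core idea: each multiplier is replaced by simple logic that emits a wide bit-vector code for the product, and the MAC result is recovered by accumulating those codes bit position by bit position before applying per-bit weights once at the end. The sketch below illustrates this flow in Python; the code width R, the per-bit weights, and the encoder encode_product are illustrative assumptions (here the exact two's-complement product is used as the code), not the paper's actual encoding.

```python
# Minimal sketch of an encoding-based MAC (illustrative assumptions, not the
# paper's actual encoding): each (activation, weight) pair is mapped to a wide
# R-bit code, and the dot product is recovered by bit-wise weighted accumulation.
import numpy as np

R = 16  # assumed width of the encoded product representation
rng = np.random.default_rng(0)

# Assumed per-bit weights: signed power-of-two weighting, so the weighted sum
# of the code bits reconstructs the (two's-complement) product.
bit_weights = np.array([2**k for k in range(R - 1)] + [-(2 ** (R - 1))])

def encode_product(a: int, w: int) -> np.ndarray:
    """Hypothetical encoder standing in for the paper's multiplier-replacing logic.
    Here it simply emits the R-bit two's-complement representation of a*w."""
    p = int(a) * int(w)
    return np.array([(p >> k) & 1 for k in range(R)], dtype=np.int64)

def encoded_mac(activations: np.ndarray, weights: np.ndarray) -> int:
    """Bit-wise weighted accumulation: sum the code bits per bit position first,
    then apply the per-bit weights in a single reduction at the end."""
    codes = np.stack([encode_product(a, w) for a, w in zip(activations, weights)])
    column_sums = codes.sum(axis=0)        # accumulate each bit position separately
    return int(column_sums @ bit_weights)  # one weighted reduction recovers the MAC

# Quick check against an ordinary MAC on small operands.
acts = rng.integers(-64, 64, size=32)
wts = rng.integers(-64, 64, size=32)
assert encoded_mac(acts, wts) == int(np.dot(acts, wts))
```

With an exact encoder the result matches a conventional MAC; the paper's point is that an approximate, much simpler encoding logic can replace the multiplier while the bit-wise weighted accumulation keeps the output format compatible with existing accelerator datapaths.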
