Analog or Digital In-memory Computing? Benchmarking through Quantitative Modeling (2405.14978v1)
Abstract: In-Memory Computing (IMC) has emerged as a promising paradigm for energy-, throughput-, and area-efficient machine learning at the edge. However, the differences in hardware architectures, array dimensions, and fabrication technologies among published IMC realizations have made it difficult to grasp their relative strengths. Moreover, previous studies have primarily focused on exploring and benchmarking the peak performance of a single IMC macro rather than full system performance on real workloads. This paper addresses the lack of a quantitative comparison between Analog In-Memory Computing (AIMC) and Digital In-Memory Computing (DIMC) processor architectures. We propose an analytical IMC performance model that is validated against published implementations and integrated into a system-level exploration framework for comprehensive performance assessments of different workloads under varying IMC configurations. Our experiments show that while DIMC generally achieves higher computational density than AIMC, AIMC with large macro sizes can be more energy-efficient than DIMC on convolutional and pointwise layers, which can exploit high spatial unrolling. Conversely, DIMC with small macro sizes outperforms AIMC on depthwise layers, which offer limited spatial-unrolling opportunities inside a macro.
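The tradeoff stated in the abstract can be illustrated with a toy first-order model: a macro's fixed peripheral energy (e.g., ADC conversions in AIMC) is spent regardless of how many rows are active, so effective energy efficiency degrades when a layer cannot fill the array. The function names, the fixed-energy fraction, and the numbers below are illustrative assumptions for intuition only, not the paper's actual validated model.

```python
def macro_utilization(channels: int, kernel_fy_fx: int, macro_rows: int) -> float:
    """Fraction of macro rows activated when unrolling C*FY*FX along the rows.

    For depthwise layers only the kernel window (FY*FX) can be unrolled per
    macro, so `channels` is effectively 1.
    """
    needed_rows = channels * kernel_fy_fx
    return min(needed_rows, macro_rows) / macro_rows


def effective_tops_per_w(peak_tops_per_w: float,
                         utilization: float,
                         fixed_energy_frac: float) -> float:
    """Effective efficiency under partial utilization.

    At full utilization, energy per op is E = (1 - f)*E + f*E, where f is the
    fixed (utilization-independent) fraction. At utilization u, the fixed part
    is amortized over fewer useful ops: E(u) = (1 - f)*E + f*E/u, giving
    eff(u) = peak * u / (u*(1 - f) + f).
    """
    u, f = utilization, fixed_energy_frac
    return peak_tops_per_w * u / (u * (1.0 - f) + f)


# Hypothetical example: a conv layer (C=64, 3x3 kernel) fills a large
# 1024-row AIMC macro reasonably well, while a depthwise layer (3x3 only)
# leaves it almost idle, eroding AIMC's energy advantage.
conv_util_large = macro_utilization(64, 9, 1024)   # 576/1024 = 0.5625
dw_util_large = macro_utilization(1, 9, 1024)      # 9/1024, very low
dw_util_small = macro_utilization(1, 9, 64)        # 9/64, much better
```

Running the toy model with an assumed 50% fixed-energy fraction shows effective efficiency dropping steeply once utilization falls well below one, which mirrors the abstract's finding that small-macro DIMC wins on depthwise layers.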
Authors: Jiacong Sun, Pouya Houshmand, Marian Verhelst