PICO-RAM: A PVT-Insensitive Analog Compute-In-Memory SRAM Macro with In-Situ Multi-Bit Charge Computing and 6T Thin-Cell-Compatible Layout (2407.12829v1)
Abstract: Analog compute-in-memory (CIM) in static random-access memory (SRAM) is promising for accelerating deep learning inference by circumventing the memory wall and exploiting ultra-efficient analog low-precision arithmetic. Latest analog CIM designs attempt bit-parallel schemes for multi-bit analog Matrix-Vector Multiplication (MVM), aiming at higher energy efficiency, throughput, and training simplicity and robustness over conventional bit-serial methods that digitally shift-and-add multiple partial analog computing results. However, bit-parallel operations require more complex analog computations and become more sensitive to well-known analog CIM challenges, including large cell areas, inefficient and inaccurate multi-bit analog operations, and vulnerability to PVT variations. This paper presents PICO-RAM, a PVT-insensitive and compact CIM SRAM macro with charge-domain bit-parallel computation. It adopts a multi-bit thin-cell Multiply-Accumulate (MAC) unit that shares the same transistor layout as the most compact 6T SRAM cell. All analog computing modules, including digital-to-analog converters (DACs), MAC units, analog shift-and-add, and analog-to-digital converters (ADCs) reuse one set of local capacitors inside the array, performing in-situ computation to save area and enhance accuracy. A compact 8.5-bit dual-threshold time-domain ADC power gates the main path most of the time, leading to a significant energy reduction. Our 65-nm prototype achieves the highest weight storage density of 559 Kb/mm$^2$ and exceptional robustness to temperature and voltage variations (-40 to 105 $^{\circ}$C and 0.65 to 1.2 V) among SRAM-based analog CIM designs.
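The contrast the abstract draws between bit-serial and bit-parallel MVM can be illustrated with an idealized integer model. The sketch below is not the PICO-RAM circuit: it assumes unsigned 4-bit inputs and weights, ignores all analog effects (noise, ADC quantization, PVT variation), and uses hypothetical function names chosen for illustration. It only shows that digitally shifting and adding per-bit partial sums reproduces the multi-bit dot product that a bit-parallel scheme would compute in one step.

```python
import random

def bit_serial_mvm(weights, inputs, input_bits):
    """Bit-serial MVM: one partial MVM per input bit-plane,
    combined by a digital shift-and-add (idealized, unsigned inputs)."""
    acc = [0] * len(weights)
    for b in range(input_bits):
        # Extract bit b of every input element (a 1-bit input vector).
        bit_plane = [(x >> b) & 1 for x in inputs]
        for r, row in enumerate(weights):
            # Partial sum for this bit-plane; in hardware this would be
            # one analog computation followed by an ADC conversion.
            partial = sum(w * bit for w, bit in zip(row, bit_plane))
            # Weight the partial sum by 2**b and accumulate.
            acc[r] += partial << b
    return acc

def bit_parallel_mvm(weights, inputs):
    """Bit-parallel MVM: the full multi-bit dot product in a single step."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]

random.seed(0)
W = [[random.randrange(16) for _ in range(8)] for _ in range(4)]  # 4x8, 4-bit weights
x = [random.randrange(16) for _ in range(8)]                      # 8 inputs, 4-bit each

serial = bit_serial_mvm(W, x, input_bits=4)
parallel = bit_parallel_mvm(W, x)
print(serial == parallel)
```

In this model the bit-serial path needs `input_bits` partial computations per output, whereas the bit-parallel path needs one; that reduction in conversions is the efficiency motivation the abstract describes, with the cost moved into more demanding analog arithmetic.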
- Zhiyu Chen (60 papers)
- Ziyuan Wen (2 papers)
- Weier Wan (2 papers)
- Akhil Reddy Pakala (1 paper)
- Yiwei Zou (4 papers)
- Wei-Chen Wei (1 paper)
- Zengyi Li (7 papers)
- Yubei Chen (32 papers)
- Kaiyuan Yang (32 papers)