
Self-Calibrated Mixed-Signal CIM Accelerator SoC

Updated 30 June 2025
  • Self-calibrated mixed-signal CIM accelerator SoC integrates dense analog compute arrays, digital control, and on-chip calibration to ensure robust, low-power AI processing.
  • On-chip self-test and automated calibration correct analog non-idealities, improving compute SNR by 6–8 dB per column and enhancing DNN inference accuracy.
  • The unified test infrastructure and scalable design enable efficient co-design and adaptive performance for energy-critical edge AI and IoT applications.

A self-calibrated mixed-signal computing-in-memory (CIM) accelerator system-on-chip (SoC) is an integrated platform that combines dense analog/mixed-signal compute arrays (typically SRAM- or resistive-based), digital control logic, and on-chip calibration circuitry to deliver high efficiency, reliability, and scalability for AI and signal processing tasks. Such SoCs are architected for robust, low-power operation despite analog non-idealities and intrinsically support self-test and self-calibration routines for sustained accuracy and system health.

1. Architectural Principles of Self-Calibrated Mixed-Signal CIM SoCs

Self-calibrated mixed-signal CIM SoCs exploit analog or mixed-signal arrays to realize parallel multiply-and-accumulate operations in memory, significantly reducing data movement and energy per inference. The analog compute core is commonly built from SRAM-based, resistive eNVM-based, or hybrid MDAC-weighted cells organized in 2D arrays, with peripheral DACs, S/H buffers, and ADCs providing bidirectional digital-analog interfacing. Digital sub-systems (e.g., embedded RISC-V CPUs) are tightly coupled to the array to orchestrate test, calibration, data movement, and host-side computation (2506.15440).
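
As a minimal illustration of this organization, the behavioral sketch below (Python/NumPy) models a single column: activations enter through input DACs, the array performs the analog multiply-accumulate, and a per-column ADC digitizes a result that carries the gain/offset mismatch the calibration machinery later corrects. The bit widths, signal ranges, and error model are illustrative assumptions, not the circuit of (2506.15440).

```python
import numpy as np

def cim_column(x, w, dac_bits=4, adc_bits=6, gain_err=1.0, offset_err=0.0, rng=None):
    """Behavioral model of one mixed-signal CIM column.

    x: activations in [0, 1); w: stored weights in [0, 1); returns an ADC code.
    All parameters (bit widths, ranges, noise level) are illustrative assumptions.
    """
    rng = rng if rng is not None else np.random.default_rng()
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    # Input DACs: quantize each activation to dac_bits levels.
    x_analog = np.round(x * (2**dac_bits - 1)) / (2**dac_bits - 1)
    # Analog multiply-accumulate along the column (charge/current summation).
    mac = x_analog @ w
    # Readout-chain non-idealities: per-column gain/offset mismatch plus noise.
    analog_out = gain_err * mac + offset_err + rng.normal(0.0, 1e-3)
    # Per-column ADC: quantize over an assumed full-scale range of len(w).
    full_scale = len(w)
    code = np.clip(np.round(analog_out / full_scale * (2**adc_bits - 1)),
                   0, 2**adc_bits - 1)
    return int(code)
```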

A defining element is the presence of autonomous on-chip calibration circuits and routines. Calibration is initiated on-chip under processor control, corrects column- and row-level analog errors, and is informed by built-in self-test vectors and per-column ADC readouts. The overall system is structured such that both digital and analog sub-blocks are accessible and schedulable through unified digital configuration and control pathways, often facilitated by a Test Access Mechanism (TAM) and test wrappers for analog sub-blocks (0710.4686).

2. Unified Test and Calibration Methodology

Test and calibration of both analog and digital cores are unified under a digital-accessible infrastructure:

  • Analog Test Wrappers: Each analog core is "wrapped" with on-chip ADCs/DACs and digital encoding/decoding logic, enabling digital test patterns to be applied, analog signals digitized, and further observed and processed through conventional TAMs (0710.4686).
  • Self-Test Mode Support: The wrapper and digital control logic provide a "self-test" mode whereby the CIM core or analog sub-block can be tested and calibrated in situ, under programmatic control, without reliance on external analog test equipment (a minimal sketch of this flow follows the list).
  • Automated Test Planning: Test and calibration routines are co-optimized for area overhead and total test time. Wrapper sharing (time-multiplexed wrappers across compatible analog cores) and resource-optimized TAM scheduling allow SoC-wide test/calibration to scale efficiently—beneficial for large CIM SoCs with many analog compute blocks.
  • Transistor-Level Validation: Implemented test wrappers achieve <5% maximum error versus direct analog observation in typical low-mid frequency analog blocks, while occupying less than one-eighth of a typical analog core’s area when fabricated in 0.12–0.5μm technologies (0710.4686).
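
As a concrete illustration of the self-test mode above, the sketch below drives a wrapped analog block with digital test patterns and flags out-of-tolerance responses. The stimulus codes, expected values, and the `wrapper_write`/`wrapper_read` helpers are hypothetical placeholders rather than the wrapper interface defined in (0710.4686).

```python
# Hypothetical in-situ self-test loop for one wrapped analog core: the embedded
# processor applies digital stimulus codes through the wrapper DAC, reads the
# wrapper ADC back over the TAM, and flags responses outside a tolerance band.

TEST_PATTERNS = [0x00, 0x15, 0x2A, 0x3F]      # assumed 6-bit stimulus codes
EXPECTED      = [0.00, 0.333, 0.667, 1.00]    # assumed normalized ideal responses
TOLERANCE     = 0.05                          # 5% bound, in line with the wrapper error above

def self_test(wrapper_write, wrapper_read):
    """Return the list of (pattern, expected, observed) triples that failed."""
    failures = []
    for pattern, expected in zip(TEST_PATTERNS, EXPECTED):
        wrapper_write("stimulus", pattern)            # digital pattern -> wrapper DAC
        observed = wrapper_read("response") / 63.0    # 6-bit wrapper ADC code, normalized
        if abs(observed - expected) > TOLERANCE:
            failures.append((pattern, expected, observed))
    return failures
```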

3. On-Chip Calibration: Algorithms and Circuit Mechanisms

The critical function of self-calibration is to suppress systematic analog non-idealities—gain/offset mismatches, device/process variation, and line parasitics—down to a level where DNN inference accuracy is minimally impacted.

  • Column/Row-wise Calibration: Gain and offset in the analog readout chain are characterized per column or per peripheral slice via known input stimulus. For example, Numan et al. (2506.15440) perform least-squares fitting between observed and ideal outputs to extract composite gain and offset for each column:

\hat{g}_{tot} = \frac{Z \sum (Q_{nom} \hat{Q}_{act}) - \sum Q_{nom} \sum \hat{Q}_{act}}{Z \sum Q_{nom}^2 - (\sum Q_{nom})^2}

\hat{\epsilon}_{tot} = \frac{\sum \hat{Q}_{act} - \hat{g}_{tot} \sum Q_{nom}}{Z}

where Z is the number of (Q_{nom}, \hat{Q}_{act}) calibration samples collected per column.

These are then compensated digitally or in the analog domain by tuning per-column resistors and bias DACs (a minimal numerical sketch of the fit follows this list).

  • Built-In Self-Calibration (BISC): Calibration steps (characterization and correction) run under embedded processor control and are initiated at power-on, periodically, or when drift is detected. The hardware is designed so that calibration can span the analog core with minimal performance impact; it yields a per-column SNR improvement of 6–8 dB, with final SNR values of 18–24 dB, sufficient to keep DNN inference accuracy within ~5% of a fixed-point baseline (2506.15440).
  • Resilience to Environmental/Process Variation: Automated routines allow recalibration against temperature, voltage, and aging drift, a requirement for energy-constrained edge AI deployments.
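
A minimal NumPy sketch of the per-column fit and digital correction described in the first item above; the estimator mirrors the closed-form expressions, while the synthetic column data, noise level, and the digital-correction helper are illustrative assumptions.

```python
import numpy as np

def fit_gain_offset(q_nom, q_act):
    """Least-squares fit of observed column outputs q_act to ideal outputs q_nom.

    Returns (g_tot, eps_tot) such that q_act ≈ g_tot * q_nom + eps_tot,
    matching the expressions above (Z = number of calibration samples).
    """
    q_nom, q_act = np.asarray(q_nom, float), np.asarray(q_act, float)
    z = len(q_nom)
    g_tot = (z * np.sum(q_nom * q_act) - np.sum(q_nom) * np.sum(q_act)) / (
        z * np.sum(q_nom**2) - np.sum(q_nom) ** 2
    )
    eps_tot = (np.sum(q_act) - g_tot * np.sum(q_nom)) / z
    return g_tot, eps_tot

def correct(q_act, g_tot, eps_tot):
    """Digital post-correction: invert the fitted per-column gain and offset."""
    return (np.asarray(q_act, float) - eps_tot) / g_tot

# Characterize one column with known BIST stimuli (synthetic example).
rng = np.random.default_rng(0)
q_nom = np.linspace(-1.0, 1.0, 64)                      # ideal MAC results
q_act = 0.92 * q_nom + 0.05 + rng.normal(0, 0.01, 64)   # column with gain/offset error
g_tot, eps_tot = fit_gain_offset(q_nom, q_act)
q_corr = correct(q_act, g_tot, eps_tot)                 # residual error is now noise-limited
```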

4. Integration Strategies and Open-Source Stack

  • Processor-Accelerator Coupling: The self-calibrated CIM SoC integrates a standard or custom RISC-V core (or similar), which issues test vectors, reads back ADC data, computes least-squares calibration coefficients, and applies hardware-level corrections through memory-mapped registers or analog trims (2506.15440).
  • Open-Source Simulation and Co-Testing: The system offers a unified open-source hardware/software infrastructure in which pre- and post-silicon testing use the same memory-mapped AXI4-Lite interface. Co-simulation frameworks such as cocotb enable end-to-end testing of calibration across simulation and hardware deployment (a minimal sketch follows this list).
  • Programmability and Full-Stack Co-Design: The calibration-aware hardware is accompanied by software APIs and meta-operators that trigger self-test/self-calibration phases, expose recalibration hooks within DNN workloads, and enable model-to-chip validation and adaptation.
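
A minimal cocotb sketch of that co-simulation flow, assuming the DUT exposes an AXI4-Lite slave with an `s_axil` signal prefix and hypothetical calibration control/status registers; the addresses, bit fields, clock/reset names, and the use of the cocotbext-axi bus driver are assumptions, not the register map of (2506.15440).

```python
import cocotb
from cocotb.clock import Clock
from cocotb.triggers import RisingEdge, Timer
from cocotbext.axi import AxiLiteBus, AxiLiteMaster

CAL_CTRL   = 0x0010  # hypothetical: write 1 to launch the BISC routine
CAL_STATUS = 0x0014  # hypothetical: bit 0 set when calibration has converged

@cocotb.test()
async def test_self_calibration(dut):
    """Trigger BISC over AXI4-Lite and wait for the done flag, as a host CPU would."""
    cocotb.start_soon(Clock(dut.clk, 10, units="ns").start())  # assumed clock name
    dut.rst.value = 1                                          # assumed reset name
    for _ in range(5):
        await RisingEdge(dut.clk)
    dut.rst.value = 0

    axil = AxiLiteMaster(AxiLiteBus.from_prefix(dut, "s_axil"), dut.clk, dut.rst)

    # Kick off on-chip self-calibration and poll the status register.
    await axil.write_dword(CAL_CTRL, 1)
    while (await axil.read_dword(CAL_STATUS)) & 0x1 == 0:
        await Timer(1, units="us")
```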

5. Adaptability to Technology Advances and Dense Integration

  • SRAM-Based vs Resistive/HDLR Arrays: While the proposed system demonstrates efficient calibration for SRAM+MDAC-based mixed-signal arrays, the architectural primitives are designed to be adaptable to higher-density post-processed resistors (e.g., mega-ohm linear resistors, WOₓ, RRAM) that offer 14×–70× area/power reduction over traditional polysilicon resistors. This enables scaling to larger CIM arrays (e.g., 128×128), higher throughput, and higher integration density (2506.15440).
  • Generality of Calibration Framework: The BISC logic is independent of memory/resistor material; it remains applicable for alternative analog arrays that support per-block analog gain and offset tuning.

6. Experimental Results and Impact on DNN Inference

  • Accuracy and SNR: MNIST MLP inference accuracy on hardware is boosted from 88.7% (uncalibrated) to 92.33% (with BISC), closely matching the simulated baseline of 94.23%. Compute SNR across the array improves from ~12–16 dB (uncalibrated) to 18–24 dB post-calibration—within the range established as sufficient for edge DNN workloads.
  • Energy and Area Efficiency: Macro-level throughput reaches 113 GOPS (1-bit), with 0.155 TOPS/mm² compute density and 6.65 TOPS/W energy efficiency (macro only), in 22-nm FD-SOI. With high-density linear resistors, area and energy per operation are projected to improve by over an order of magnitude.
  • Calibration Overhead: The BISC procedure is low-latency, negligible in energy, and can be executed during idle/maintenance cycles. Calibration infrastructure is digitally controlled, parameterizable, and extensible.

7. Relevance and Outlook for Edge AI Systems

Self-calibrated mixed-signal CIM SoCs provide a viable pathway for accurate, reliable, and scalable analog computing in real-world AI systems. By equipping analog cores with digital test/calibration access, automated on-chip measurement and correction, and open-source control infrastructure, these SoCs overcome the primary barriers of analog non-ideality and process drift, enabling their adoption in energy-critical, autonomous edge AI and IoT devices. The system is extensible to emerging resistor and memory technologies and forms a foundation for adaptive, heterogeneous AI SoCs with fine-grained, in-field tuning capabilities.