Memristive In-Memory Computing
- Memristive IMC is an in-memory computing paradigm that uses memristors in dense crossbar arrays to execute logic and arithmetic, thereby overcoming the memory wall.
- It leverages stateful logic and analog operations such as MAGIC gates and vector–matrix multiplication to enable parallel computing with significant energy and speed benefits.
- Advanced device modeling, error mitigation strategies, and co-design frameworks ensure reliable performance in applications ranging from deep learning accelerators to hardware security.
Memristive In-Memory Computing (IMC) refers to computational paradigms and hardware architectures in which logic and arithmetic operations are physically performed within the memory array using memristive devices, typically organized as dense crossbar arrays. By collocating storage and computation in the same physical substrate, memristive IMC enables non-von Neumann architectures that can overcome the memory wall, reduce energy consumption, and exploit the intrinsic parallelism of nanoscale resistive switches.
1. Physical Principles and Device Models
Memristors are two-terminal resistive devices whose conductance is a function of an internal state variable that encodes the integral of past current or voltage. Device implementations include transition-metal-oxide ReRAM (ECM, VCM, TCM), phase-change memory (PCM), magnetoresistive RAM (MRAM), and emerging all-optically controlled stacks (Mehonic et al., 2020, Yang et al., 2021). A canonical compact model (in the TEAM style) writes the state-dependent resistance as

$$R(x) = R_{on} + \frac{x - x_{on}}{x_{off} - x_{on}}\,(R_{off} - R_{on}),$$

with $x \in [x_{on}, x_{off}]$, and the evolution law (window functions omitted for clarity)

$$\frac{dx}{dt} = \begin{cases} k_{off}\left(\dfrac{i}{i_{off}} - 1\right)^{\alpha_{off}}, & i > i_{off} \\[4pt] 0, & i_{on} < i < i_{off} \\[4pt] k_{on}\left(\dfrac{i}{i_{on}} - 1\right)^{\alpha_{on}}, & i < i_{on} \end{cases}$$

where $R_{off}$, $R_{on}$ are the high- and low-resistance states and $i_{off}$, $i_{on}$ are threshold currents (Kvatinsky, 2022).
Physical memristor stacks often exhibit stochastic cycle-to-cycle variation, nonlinearity of the conductance update $\Delta G$ versus pulse amplitude, and drift in analog levels, particularly for PCM (Petropoulos et al., 2020, Mehonic et al., 2020). Multi-level operation (e.g., 16 levels for 4-bit cells) allows compact variable-precision storage (Ajmi et al., 2022, Ajmi et al., 21 Aug 2024), while advanced models account for G–V nonlinearity, log-normal disorder, and parasitic wire effects (Zhou et al., 21 Nov 2025).
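The threshold-type state dynamics above can be sketched in a few lines; the following is a minimal behavioral model in the spirit of TEAM, with illustrative (not fitted) parameter values for `R_ON`, `R_OFF`, the threshold currents, and the state-velocity coefficients:

```python
import numpy as np

# Illustrative parameters (not fitted to any real device).
R_ON, R_OFF = 1e3, 1e6          # low / high resistance states (ohms)
I_ON, I_OFF = -10e-6, 10e-6     # SET / RESET threshold currents (amps)
K_ON, K_OFF = -1e4, 1e4         # state-velocity coefficients (1/s)

def resistance(x):
    """Linear interpolation between R_on and R_off by state x in [0, 1]."""
    return R_ON + x * (R_OFF - R_ON)

def step(x, i, dt):
    """Advance internal state x by one time step under current i.
    The state only moves when i exceeds the corresponding threshold."""
    if i > I_OFF:                      # RESET direction: toward R_off
        dxdt = K_OFF * (i / I_OFF - 1.0)
    elif i < I_ON:                     # SET direction: toward R_on
        dxdt = K_ON * (i / I_ON - 1.0)
    else:
        dxdt = 0.0                     # sub-threshold: non-volatile retention
    return float(np.clip(x + dxdt * dt, 0.0, 1.0))

# A supra-threshold RESET pulse moves the device toward R_off;
# a small read current leaves the state untouched (no read disturb).
x = step(0.0, 20e-6, 1e-6)
assert x > 0.0
assert step(x, 1e-6, 1e-6) == x
```

The key qualitative feature captured here is the thresholded nonlinearity: sub-threshold read currents leave the state unchanged, which is what makes non-destructive readout and stateful logic possible in the same device.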
2. Crossbar Array Organization and Stateful Logic
The fundamental architectural element of memristive IMC is the crossbar: an $N \times M$ grid in which each intersection hosts a memristive device (1R, 1S1R, or 1T1R). Standard memory reads and writes use voltage drivers as in conventional RRAM. For computation, the same hardware executes logic primitives using “stateful logic,” where voltages applied to selected wordlines/bitlines induce controlled changes in device state, encoding Boolean or arithmetic functions (Kvatinsky, 2022, Esmanhotto et al., 2022).
A core primitive is the MAGIC gate, e.g., for a NOR:
- Precharge the output cell to $R_{on}$ (logic '1')
- Apply $V_0$ to the wordlines of the input devices; ground the output wordline
- Depending on the input states, the output device transitions to $R_{off}$ (logic '0') if any input stores '1', or remains at $R_{on}$ otherwise
Other schemes leverage material implication (IMPLY) logic, with full coverage of universal gates (Leitersdorf et al., 2021, Du et al., 23 Jun 2025). Advanced microarchitectures utilize row/column partitioning, gating transistors for parallel operations, and enhanced scheduling to emulate ILP mappings (SIMPLE/SIMPLER) or generate SAT-optimal synthesis for arbitrary logic (M³S) (Du et al., 23 Jun 2025).
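The MAGIC NOR steps above can be captured by a purely behavioral model (logical states only; voltages and thresholds abstracted away), which also shows why NOR suffices as a universal primitive:

```python
# Behavioral sketch of a MAGIC NOR gate on a crossbar row: each cell is
# modeled only by its logical state (R_on = 1, R_off = 0); circuit-level
# voltage dividers and switching thresholds are abstracted away.

def magic_nor(inputs):
    """Output cell is precharged to R_on (logic 1). If any input stores
    logic 1, the voltage across the output device exceeds its switching
    threshold and it transitions to R_off (logic 0)."""
    out = 1                         # step 1: precharge output to R_on
    if any(inputs):                 # steps 2-3: apply V0, conditional switch
        out = 0
    return out

def magic_not(a):
    """NOT is a one-input NOR."""
    return magic_nor([a])

# NOR is functionally complete: OR and AND follow via De Morgan's laws.
def magic_or(a, b):
    return magic_not(magic_nor([a, b]))

assert [magic_nor([a, b]) for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]] \
    == [1, 0, 0, 0]
assert magic_or(1, 0) == 1
```

Because every gate in a row (or column) fires under the same applied voltages, one array cycle evaluates the same gate across many operand sets in parallel, which is the source of the throughput figures discussed below.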
3. Analog In-Memory Computing for Linear Algebra
For vector–matrix multiplication (VMM), Ohm’s and Kirchhoff’s laws permit direct computation of dot products: $I_j = \sum_i G_{ij} V_i$, where $G_{ij}$ is the conductance at row $i$, column $j$, and $V_i$ is the input voltage on row $i$. This operation is performed for all output nodes in parallel, implementing a VMM in O(1) array cycles (Petropoulos et al., 2020, Mehonic et al., 2020).
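In an idealized array this reduces to a single matrix product; the following sketch (with assumed conductance and voltage ranges) makes the column-current summation explicit:

```python
import numpy as np

# Sketch: analog VMM on an ideal crossbar. Weights map to conductances
# G[i, j]; applying voltages V[i] to the rows yields bitline currents
# I[j] = sum_i G[i, j] * V[i]  (Ohm's law per device, Kirchhoff at columns).
rng = np.random.default_rng(0)

G = rng.uniform(1e-6, 1e-4, size=(64, 32))  # conductances in siemens (assumed range)
V = rng.uniform(0.0, 0.2, size=64)          # small read voltages to avoid disturb

I = G.T @ V                                 # all 32 dot products in one array cycle

# Elementwise check for one column: the matrix product equals the current sum.
assert np.isclose(I[0], sum(G[i, 0] * V[i] for i in range(64)))
```

Every column current is available simultaneously, which is why the operation counts as O(1) array cycles regardless of the vector length.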
Partitioning strategies (horizontal/vertical splits) are imperative for practical deployments—large monolithic arrays are severely degraded by IR-drop and parasitics, resulting in unacceptably poor SNR and accuracy collapse in DNN inference (Amin et al., 2023, Amin et al., 2022). Tiled subarrays with on-chip analog accumulation support reliability and scalability to large models (Zhou et al., 21 Nov 2025, Amin et al., 2022).
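The tiled-subarray idea can be sketched as follows; the tile size of 64 is an assumption for illustration, and the off-array accumulation is modeled as an exact sum (real designs accumulate in analog or digital peripherals with their own error budgets):

```python
import numpy as np

# Sketch of tiled VMM: a large weight matrix is split into subarrays
# whose partial column currents are accumulated outside the array,
# bounding the wire length (and hence IR-drop) seen by any one tile.
def tiled_vmm(G, V, tile=64):
    n_rows, n_cols = G.shape
    I = np.zeros(n_cols)
    for r0 in range(0, n_rows, tile):
        for c0 in range(0, n_cols, tile):
            sub = G[r0:r0 + tile, c0:c0 + tile]          # one physical subarray
            I[c0:c0 + tile] += sub.T @ V[r0:r0 + tile]   # on-chip accumulation
    return I

rng = np.random.default_rng(1)
G = rng.uniform(1e-6, 1e-4, size=(256, 128))
V = rng.uniform(0.0, 0.2, size=256)
assert np.allclose(tiled_vmm(G, V), G.T @ V)  # ideal tiles match the monolithic result
```

With ideal devices the tiling is mathematically transparent; its benefit appears only once parasitic wire resistance is modeled, where smaller tiles keep each device's effective operating point close to nominal.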
Digital-to-analog (DAC) and analog-to-digital (ADC) converters are required at the digital–analog boundary, though advances in fully analog neurons (e.g., MRAM-based sigmoid circuits) are eliminating most conversion overhead, yielding >10x energy savings and 10–50x latency reductions in deep multilevel MLPs (Amin et al., 2022).
4. Performance, Parallelism, and Microarchitectural Acceleration
Memristive IMC's primary advantage is the elimination of CPU–memory data shuttling. In a practical mMPU, every row of a crossbar can execute a multi-bit gate in the same cycle, so gate throughput scales linearly with the number of rows and effective per-gate latencies fall to the picosecond scale (Kvatinsky, 2022).
Energy efficiency is striking compared to von Neumann machines: well under $1$ pJ/bit in mMPU logic versus tens of pJ/bit in CPUs, an improvement of two or more orders of magnitude. Matrix–vector multiplication, convolution, and cryptographic kernels (e.g., AES) benefit from parallelism, pipelining, and crossbar-mapped LUT structures. In hardware security, 4-bit multi-level memristor-based AES-IMC achieves a 30% power reduction and a 62% throughput gain versus CMOS and NVM baselines (Ajmi et al., 2022, Ajmi et al., 21 Aug 2024).
Stateful logic multipliers and advanced partition-based architectures (e.g., MultPIM, MatPIM) reduce latency in matrix operations from linear to logarithmic scaling in key reduction steps (binary/popcount trees, block reduction, input-parallel convolution), achieving significant reported speedups over prior art (Leitersdorf et al., 2021, Leitersdorf et al., 2022).
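The logarithmic-latency claim follows from tree-shaped reduction, sketched below in abstract form (actual MultPIM/MatPIM mappings operate on in-array partial products, which this model elides):

```python
# Sketch of log-depth reduction: N partial results are combined pairwise,
# so each "array cycle" halves the operand count and total latency scales
# as O(log N) instead of O(N) sequential additions.
def tree_reduce(values):
    vals = list(values)
    steps = 0
    while len(vals) > 1:
        # all pairs at one tree level are combined in parallel in the array
        vals = [vals[i] + vals[i + 1] for i in range(0, len(vals) - 1, 2)] + \
               ([vals[-1]] if len(vals) % 2 else [])
        steps += 1
    return vals[0], steps

total, depth = tree_reduce(range(16))
assert total == sum(range(16))
assert depth == 4          # log2(16) levels vs. 15 sequential additions
```

The same structure underlies popcount trees and block reduction: the array's row-level parallelism absorbs the width of each tree level, leaving only its depth on the critical path.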
Table: Sample performance metrics (as reported)
| Architecture | Latency | Power/Energy | Parallelism/Notes |
|---|---|---|---|
| mMPU MAGIC (1024-bit) | — | $44$ pJ ($\approx 0.04$ pJ/bit) | All rows/bitlines in parallel (Kvatinsky, 2022) |
| MatPIM (binary MVM) | $383$ cycles | N/A | Speedup vs. prior art (Leitersdorf et al., 2022) |
| MRAM-analog sigmoid | $4$ ns | $0.968$ nJ inference | No ADC/DAC in layers, $60$ TOPS/W (Amin et al., 2022) |
| AES-IMC (FPGA, 64b) | $26$ cycles | $0.098$ mW | $30\%$ lower power, $62\%$ higher throughput vs. NVM-AES (Ajmi et al., 2022) |
5. Device Nonidealities, Reliability, and Error Mitigation
Memristive IMC research has systematically addressed variability ($R_{off}/R_{on}$ spread, threshold scatter), temporal drift (notably in PCM), noise, and endurance. Key reliability and correction measures include:
- In-memory ECC: diagonal parity codes embedded in the crossbar logic stream, enabling in-place detection/correction with a substantially increased mean time between failures (Kvatinsky, 2022).
- Full-correction programming (e.g., a 5 s wait after SET, then reverify): ensures MLC levels remain within narrow windows over extended cycling and one-month retention (Esmanhotto et al., 2022).
- Analog-aware neural retraining, differential-pair weight encoding, and multi-device synapses: combat drift, noise, and device-to-device asymmetry (Mehonic et al., 2020, Petropoulos et al., 2020).
- Partitioning: necessary to control IR-drop; for DNNs, small tiled blocks with CBRAM (resistance ratio $\approx 100$) preserve near-baseline accuracy, while large monolithic arrays can collapse to near-chance accuracy due to parasitics (Amin et al., 2023, Amin et al., 2022).
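Among the mitigation measures above, differential-pair weight encoding admits a compact sketch; the full-scale conductance `G_MAX` is an assumed parameter, and drift is modeled as a simple multiplicative factor:

```python
import numpy as np

# Sketch of differential-pair weight encoding: a signed weight w is stored
# as two conductances with w proportional to (G_plus - G_minus). Common-mode
# disturbances hitting both devices partially cancel in the difference.
G_MAX = 1e-4                       # assumed full-scale conductance (siemens)

def encode(w):
    """Map w in [-1, 1] onto a non-negative conductance pair."""
    g_plus = G_MAX * max(w, 0.0)
    g_minus = G_MAX * max(-w, 0.0)
    return g_plus, g_minus

def decode(g_plus, g_minus):
    return (g_plus - g_minus) / G_MAX

gp, gm = encode(-0.5)
assert np.isclose(decode(gp, gm), -0.5)

# Multiplicative drift affecting both devices equally only rescales the
# weight; it does not shift its sign or zero point.
drift = 0.9
assert np.isclose(decode(gp * drift, gm * drift), -0.45)
```

The zero-point stability is the practical payoff: a single-device encoding under the same drift would shift every weight by an offset, whereas the differential pair degrades gracefully as a gain error that retraining or calibration can absorb.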
Optical control (e.g., Au/ZnO/Pt memristors) provides low-energy, non-volatile modulation with high uniformity and no filamentary variation (Yang et al., 2021). Mixed-mode schemes (combined voltage- and memristance-based logic) achieve best-in-class area–delay tradeoffs and >99% reliability in full-adders and S-box units (Du et al., 23 Jun 2025).
6. Simulation, Co-Design, and System Integration
End-to-end, cross-layer simulation and co-design are now supported by frameworks like MemIntelli, IMAC-Sim, and XbarSim:
- MemIntelli: Unified device-to-application stack with explicit G–V noise models, crossbar-level Ohmic solving, flexible precision bit-slicing, and NumPy/PyTorch APIs. Enables pre-verification of ML, clustering, wavelet, and equation-solving workloads under device/circuit nonidealities (Zhou et al., 21 Nov 2025).
- IMAC-Sim/XbarSim: Circuit-level SPICE or matrix-equation-based simulators that incorporate device resistance, variability, and line parasitics, and allow batched MVM pipelines to be explored at full array scale (Amin et al., 2023, Kolinko et al., 25 Oct 2024).
- NAX (Neural Architecture and crossbar eXplorer): Joint NAS and crossbar sizing for DNNs, trading off energy, delay, and area under realistic crossbar models; finds layer-heterogeneous configurations that dominate homogeneous baselines in EDAP and resilience (Negi et al., 2021).
- Design guidelines: co-optimize hardware topology and algorithm parameters; fine-tune subarray size and device parameters (e.g., number of conductance levels) per layer; and always simulate under full noise and parasitic models to avoid catastrophic deployment failures.
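The precision bit-slicing used in such flows can be sketched as follows; the 4-bit unsigned weights and the exact digital recombination are illustrative assumptions (real designs recombine noisy per-slice ADC outputs):

```python
import numpy as np

# Sketch of precision bit-slicing: a B-bit integer weight matrix is split
# into B binary planes, each mapped to its own (binary) crossbar; per-slice
# VMM results are recombined digitally with powers-of-two shifts.
def bit_slices(W, bits):
    """LSB-first binary planes of an integer matrix."""
    return [(W >> b) & 1 for b in range(bits)]

def sliced_vmm(W, x, bits):
    acc = np.zeros(W.shape[1], dtype=np.int64)
    for b, plane in enumerate(bit_slices(W, bits)):
        acc += (plane.T @ x) << b          # shift-and-add recombination
    return acc

rng = np.random.default_rng(2)
W = rng.integers(0, 16, size=(32, 8))      # 4-bit unsigned weights
x = rng.integers(0, 4, size=32)
assert np.array_equal(sliced_vmm(W, x, bits=4), W.T @ x)
```

Bit-slicing trades array count for device precision: each physical crossbar only needs to distinguish two conductance levels, which relaxes the variability and drift requirements discussed in the previous section.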
7. Applications and Outlook
Memristive IMC architectures are being experimentally and numerically demonstrated in a variety of domains:
- Deep learning accelerators: full-precision and binary neural networks, spiking neural nets, on-chip learning, and in situ training (Mehonic et al., 2020, Petropoulos et al., 2020, Amin et al., 2022).
- Hardware security: AES engines implemented in-memory with 4-bit MLC memristors, resistive LUT S-boxes, and side-channel resistance (Ajmi et al., 2022, Ajmi et al., 21 Aug 2024).
- Logic and vector processors: Memristive vector processors offload core computation, automata processors map state machines, and general logic/S-box execution (AES, adders) is achieved with mixed-mode or stateful logic (Mehonic et al., 2020, Du et al., 23 Jun 2025).
- Large-scale ML and signal processing: equation solving, wavelet transforms, data clustering, and analog computing primitives are mapped to bit-sliced, partitioned, and quantized crossbar engines (Zhou et al., 21 Nov 2025).
- In-memory data movement: O(1) constant-time word-wise cloning (IMM) doubles the efficiency of data manipulation inside RRAM crossbars (Singh et al., 3 Jul 2024).
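One way to see why word-wise cloning can be O(1) in word width is the double-inversion identity NOT(NOT(x)) = x realized with stateful gates; this behavioral sketch is an illustration of that idiom, not IMM's actual circuit-level mechanism:

```python
# Behavioral sketch: copying a word with two stateful NOT passes.
# Each pass is one array cycle because every bit's gate fires in
# parallel along the row, so the copy takes two cycles regardless
# of word width -> O(1) in the number of bits.
def stateful_not(word):
    """One array cycle: all bits inverted in parallel."""
    return [1 - b for b in word]

def clone_word(word):
    """Two cycles total: NOT(NOT(x)) = x."""
    return stateful_not(stateful_not(word))

src = [1, 0, 1, 1, 0, 0, 1, 0]
assert clone_word(src) == src
```

The constant-cycle property holds because the cost is measured in array cycles, not gate evaluations; widening the word adds parallel devices, not serial steps.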
Challenges remain in scaling array dimensions (IR-drop control, sneak-paths), achieving reliable, dense multilevel storage, enhancing endurance/retention, and integrating DAC/ADC-free analog compute for true end-to-end energy savings. Emerging directions span all-optical control, neuromorphic in-memory computation, and algorithm–hardware co-adaptation to nonideality profiles (Yang et al., 2021, Mehonic et al., 2020).
Memristive IMC is approaching practical density, power, and performance frontiers for post-von Neumann computation, subject to ongoing advances in materials, circuit, system, and co-design methodologies.