IzhiRISC-V: RISC-V for Neuromorphic SNNs

Updated 16 December 2025

IzhiRISC-V is a RISC-V-compliant processor architecture featuring a custom instruction set for efficient spiking neural network computation based on the Izhikevich neuron model.
It integrates a Neuron Processing Unit (NPU) and a Decay Unit (DCU) directly into the pipeline to achieve single-cycle state updates and synaptic decay with fixed-point arithmetic.
Benchmarks on a dual-core FPGA prototype report a 1.64–1.68× speedup over a single core, and ASIC synthesis estimates reach 9.67 GUpd/s/W in ASAP7. Hazard stalls stay below 1% on a single core but rise to roughly 5–6% with two cores.

IzhiRISC-V is a RISC-V-compliant processor architecture incorporating a custom instruction set extension designed for efficient spiking neural network (SNN) computation, with particular emphasis on the Izhikevich neuron model. It features a deeply integrated Neuron Processing Unit (NPU) and Decay Unit (DCU) that augment the baseline integer pipeline to accelerate neuron state updates and synaptic current decay with single-cycle, fixed-point hardware instructions, significantly increasing compute density and energy efficiency for neuromorphic workloads (Szczerek et al., 18 Aug 2025).

1. Processor Architecture and Pipeline Integration

The IzhiRISC-V is derived from the DTEK-V baseline core, implementing the RV32IMZ instruction set (combining RV32I, M, and Zicsr). The core's pipeline is organized into three stages:

Fetch+Decode (IF/ID): Instruction fetch and decode are merged into a single stage.
Execute (EX): The main execution stage where both standard and neuromorphic instructions are dispatched.
Memory+Write-Back (MEM/WB): Memory access and data write-back are combined.

A forwarding unit mitigates read-after-write (RAW) hazards by propagating results back to the EX stage as needed, with stall cycles inserted only when unavoidable.

Neuromorphic enhancements are realized via hardware extensions directly merged into the pipeline's execution path:

The NPU and DCU are tightly grafted into the ALU datapath, sharing operand multiplexers and participating equally in the pipeline flow.
IF/ID recognizes neuromorphic instructions via the RISC-V custom-0 opcode ($0001011_2$).
In the EX stage, the hazard unit examines $funct3$ to route to one of: standard ALU, NPU, or DCU.
Results from the NPU/DCU are written to registers or main memory based on instruction semantics.

This integration maintains RISC-V programming conventions, removing the need for context switches to co-processors, while allowing fast hardware-accelerated SNN computations.
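The decode-and-route behavior described above can be sketched in software as follows. This is an illustrative model only, not the authors' RTL: it assumes `funct3` sits in the standard RISC-V position (bits 14:12) for both R-type and N-type words, that the two configuration loads are handled by the NPU, and that every non-custom opcode simply goes to the "ALU" path. The function name `route` and the `"ILLEGAL"` label are hypothetical.

```python
CUSTOM0_OPCODE = 0b0001011  # RISC-V custom-0 major opcode, bits 6:0

# funct3 values as listed in the instruction table of Section 2
NEUROMORPHIC_OPS = {0b000: "nmldl", 0b001: "nmldh", 0b010: "nmpn", 0b011: "nmdec"}


def route(instr: int) -> str:
    """Return the functional unit a 32-bit instruction word would be steered to.

    Assumption: funct3 occupies bits 14:12, as in standard R-type encodings.
    """
    opcode = instr & 0x7F
    funct3 = (instr >> 12) & 0x7
    if opcode != CUSTOM0_OPCODE:
        return "ALU"  # simplification: all baseline instructions lumped together
    op = NEUROMORPHIC_OPS.get(funct3)
    if op is None:
        return "ILLEGAL"  # hypothetical handling of unused funct3 values
    # Assumption: only nmdec uses the DCU; the other three target the NPU.
    return "DCU" if op == "nmdec" else "NPU"


if __name__ == "__main__":
    nmpn_word = (0b010 << 12) | CUSTOM0_OPCODE
    print(route(nmpn_word))
```

Because the routing decision depends only on the opcode and `funct3`, the custom instructions can share the existing operand multiplexers without a separate co-processor interface.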

2. Custom ISA Extension: Neuromorphic Instructions

IzhiRISC-V defines a dedicated ISA extension using the RISC-V "custom-0" opcode format for neuromorphic processing, particularly optimized for Izhikevich neuron simulation. Four new instructions are introduced:

Instruction	funct3	Format	Semantics
nmldl	000	R	Load Izhikevich parameters $a$, $b$, $c$, $d$ into NPU configuration registers. Operands: $a$, $b$ (Q4.11 fixed-point), $c$ (Q7.8), $d$ (Q4.11).
nmldh	001	R	Load time-step $h$ and clamp-voltage flag into NPU configuration (the $h$ bit selects the step size; the $pin$ bit sets voltage-clamp behavior).
nmpn	010	N	Execute single-cycle forward Euler update for $(v, u)$ neuron state using NPU. Inputs: address of $VU$ state, $I_{syn}$ (Q15.16). Outputs updated $VU$ and a spike flag.
nmdec	011	R	Perform single-cycle synaptic current exponential decay; parameterized by $\tau$ and approximated via shift-and-add in the DCU.

All instructions follow R-type or N-type encodings. The nmpn instruction uses its destination register both as a source (the $VU$ word) and to return the spike flag.
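To make the operand formats concrete, the sketch below quantizes a parameter set to the Q formats named in the table. It assumes that Qm.n means one sign bit, m integer bits, and n fraction bits (so Q4.11 and Q7.8 are 16-bit words), and that out-of-range values saturate; neither detail is stated in the source. The regular-spiking values ($a=0.02$, $b=0.2$, $c=-65$, $d=8$) are the commonly used ones from Izhikevich's model, chosen here only for illustration, and the helper names are hypothetical.

```python
def to_fixed(value: float, int_bits: int, frac_bits: int) -> int:
    """Quantize to signed Qm.n (assumed: 1 sign + m integer + n fraction bits).

    Values outside the representable range saturate (an assumption).
    """
    scaled = round(value * (1 << frac_bits))
    total = int_bits + frac_bits
    lo, hi = -(1 << total), (1 << total) - 1
    return max(lo, min(hi, scaled))


def from_fixed(raw: int, frac_bits: int) -> float:
    """Convert a raw fixed-point integer back to a float."""
    return raw / (1 << frac_bits)


if __name__ == "__main__":
    # Illustrative regular-spiking parameters, not taken from the source.
    params = {"a": (0.02, 4, 11), "b": (0.2, 4, 11), "c": (-65.0, 7, 8), "d": (8.0, 4, 11)}
    for name, (value, m, n) in params.items():
        raw = to_fixed(value, m, n)
        print(f"{name}: raw={raw:6d}  decoded={from_fixed(raw, n):.6f}")
```

Under these assumptions, $c$ and $d$ are represented exactly, while $a$ and $b$ pick up a small rounding error from the 11-bit fraction.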

3. Izhikevich Neuron Model: Hardware Mapping

The NPU specializes in fixed-point implementations of the Izhikevich neuron model as described by the following equations:

Continuous time: $\frac{dv}{dt} = 0.04\,v^2 + 5\,v + 140 - u + I_{syn}, \quad \frac{du}{dt} = a(bv - u)$, with spike and reset applied when $v > V_{TH}$: $v \leftarrow c,\quad u \leftarrow u + d$.

Discretized for hardware as: $v_{n+1} = v_n + h\,(0.04\,v_n^2 + 5\,v_n + 140 - u_n + I_{syn}), \quad u_{n+1} = u_n + h\,a\,(b\,v_n - u_n)$
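As a floating-point reference for the discretized equations, a single forward-Euler step might look like the sketch below. It is not the NPU's fixed-point datapath. The threshold value `v_th=30.0` and the choice to test the threshold after the update are assumptions, since the source gives neither, and the function name is hypothetical.

```python
def izhikevich_step(v, u, i_syn, a, b, c, d, h=1.0, v_th=30.0):
    """One forward-Euler step of the Izhikevich model.

    Returns (v_next, u_next, spiked). Assumptions: v_th = 30 and the
    threshold check is applied to the freshly updated membrane potential.
    """
    v_next = v + h * (0.04 * v * v + 5.0 * v + 140.0 - u + i_syn)
    u_next = u + h * a * (b * v - u)
    if v_next > v_th:
        # Spike: reset v to c and bump the recovery variable by d.
        return c, u_next + d, True
    return v_next, u_next, False


if __name__ == "__main__":
    v, u = -65.0, -13.0
    spikes = 0
    for _ in range(1000):  # 1000 steps with h = 1 ms and a constant input current
        v, u, spiked = izhikevich_step(v, u, 10.0, 0.02, 0.2, -65.0, 8.0)
        spikes += spiked
    print("spikes in 1 s:", spikes)
```

The `nmpn` instruction is described as performing the equivalent of one such call per neuron, with the spike flag returned alongside the updated state.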

The NPU implements:

Quadratic and linear terms using pipelined multiplies and accumulator units, with Q7.8 and Q4.11 fixed-point formats to balance range and resolution.
Reset logic, which checks for threshold crossing and applies reset in the same cycle, setting an LSB spike flag.
Exponential synaptic decay via DCU, using shift-and-add methods with low error (e.g., division by 2, 3, 7, or 8 yields errors under 0.4%), as tabulated below.

Division	Approximation	Approx. Error
$x/2$	$x\gg1$	$0\,\%$
$x/3$	$(x\gg2)+(x\gg4)+\dots$	$0.3906\,\%$
$x/7$	$(x\gg3)+(x\gg6)+(x\gg9)$	$0.1953\,\%$
$x/8$	$x\gg3$	$0\,\%$

This mapping enables tight hardware loops for neuron evolution while keeping results close to double-precision and fixed-point MATLAB baselines.
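The tabulated approximation errors can be checked with a few lines of arithmetic. The sketch below assumes the $x/3$ series is truncated after four terms, $(x\gg2)+(x\gg4)+(x\gg6)+(x\gg8)$; the source elides the remaining terms, and four is simply the count that matches the listed error. Function names are hypothetical, and the error calculation ignores the extra truncation that integer shifts introduce for small operands.

```python
def shift_add_div(x: int, shifts) -> int:
    """Approximate x / k as a sum of right shifts (integer, truncating)."""
    return sum(x >> s for s in shifts)


def series_error(divisor: int, shifts) -> float:
    """Relative error of the ideal shift series versus exact 1/divisor."""
    approx = sum(2.0 ** -s for s in shifts)
    return abs(1.0 - divisor * approx)


# Shift sets per divisor; the four-term set for 3 is an inference (see above).
SHIFTS = {2: (1,), 3: (2, 4, 6, 8), 7: (3, 6, 9), 8: (3,)}

if __name__ == "__main__":
    for k, shifts in SHIFTS.items():
        print(f"x/{k}: error = {100 * series_error(k, shifts):.4f} %")
    print(shift_add_div(768, SHIFTS[3]), "vs exact", 768 // 3)
```

The series for 3 and 7 sum to $255/256$ and $511/512$ of the exact quotient, which is where the 0.3906% and 0.1953% figures come from.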

4. Microarchitectural Details and Physical Resource Utilization

ALU augmentation integrates the NPU and DCU as functional units within the EX stage. Configuration registers (NM_REGS) hold $a$, $b$, $c$, $d$, $h$, $\tau$, and the clamp flag. The RTL is written in VHDL using the IEEE fixed-point package (sfixed); resource optimizations such as multiplier sharing are intentionally omitted to maximize computation fidelity.

Synthesized utilization on an Intel MAX10 FPGA (10M50DAF484C7G, dual-core at 30 MHz):

Metric	Utilization
Logic elements	49,248 (99%)
Flip-Flops	28,235 (51%)
BRAM	346.5 Kb (21%)
9-bit multipliers	68 (24%)

Scalability projections on Intel Agilex-7 (100 MHz) suggest:

16 cores: 8% ALMs, 152 DSPs
32 cores: 17% ALMs, 304 DSPs
64 cores: 32% ALMs, 608 DSPs

After mapping to standard cells:

FreePDK45 (45 nm): 201.5 MHz, 67.6 M neuron-updates/s at 49.5 mW ($1.37$ GUpd/s/W), NPU area ≈ 20%, DCU < 2%
ASAP7 (7 nm): 316.3 MHz, 105.4 M neuron-updates/s at 10.9 mW ($9.67$ GUpd/s/W)
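The efficiency figures follow directly from throughput divided by power, as the short calculation below shows. The dictionary layout and function names are hypothetical. The clock-to-update ratio is a derived observation rather than something the source states, so treat it as indicative only.

```python
def gupd_per_s_per_w(updates_per_s: float, power_w: float) -> float:
    """Energy efficiency in giga neuron-updates per second per watt."""
    return updates_per_s / power_w / 1e9


def cycles_per_update(clock_hz: float, updates_per_s: float) -> float:
    """Clock cycles spent per neuron update (derived, not from the source)."""
    return clock_hz / updates_per_s


# Figures as quoted for the two standard-cell mappings.
NODES = {
    "FreePDK45": {"clock_hz": 201.5e6, "updates_per_s": 67.6e6, "power_w": 49.5e-3},
    "ASAP7": {"clock_hz": 316.3e6, "updates_per_s": 105.4e6, "power_w": 10.9e-3},
}

if __name__ == "__main__":
    for name, n in NODES.items():
        eff = gupd_per_s_per_w(n["updates_per_s"], n["power_w"])
        cpu = cycles_per_update(n["clock_hz"], n["updates_per_s"])
        print(f"{name}: {eff:.2f} GUpd/s/W, {cpu:.2f} cycles/update")
```

Both nodes come out near three clock cycles per update, which may reflect the instruction sequence surrounding each `nmpn` rather than the NPU itself.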

5. Benchmarking: Performance and Energy Efficiency

Benchmarks incorporate both synthetic networks and application-driven scenarios.

80-20 network (1,000 neurons, 1,000 timesteps, $h=1$ ms):

Single-core: 7.87 s (127,000 neuron-updates/s).
Dual-core: 4.79 s (209,000 neuron-updates/s, $1.64\times$ speedup).
Effective IPC: $\sim 0.65$ (ideal 1.0 without custom ops).
Cache hit rates: I-cache 99.97%, D-cache 96.5–97.2%.
Hazard stalls: 0.74% (single-core), 5.34–6.26% (dual-core).

Metric	Single-core	Dual-core (each)
Execution time [s]	7.870	4.791
Speed-up	1.00×	1.643×
IPC	0.574	0.532–0.519
Effective IPC	0.652	0.664–0.651
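The throughput and speedup numbers for the 80-20 network can be reproduced from the execution times alone. The sketch below does that arithmetic and also derives a parallel efficiency (speedup divided by core count), which the source does not report; the names used are hypothetical.

```python
NEURONS = 1_000
TIMESTEPS = 1_000


def throughput(exec_time_s: float) -> float:
    """Neuron-updates per second for the 1,000-neuron, 1,000-step benchmark."""
    return NEURONS * TIMESTEPS / exec_time_s


def speedup(t_single_s: float, t_multi_s: float) -> float:
    """Ratio of single-core to multi-core execution time."""
    return t_single_s / t_multi_s


if __name__ == "__main__":
    t1, t2 = 7.870, 4.791  # execution times from the table, in seconds
    s = speedup(t1, t2)
    print(f"single-core: {throughput(t1):,.0f} updates/s")
    print(f"dual-core:   {throughput(t2):,.0f} updates/s")
    print(f"speedup: {s:.3f}x, parallel efficiency: {s / 2:.1%}")
```

A parallel efficiency of roughly 82% on two cores is consistent with the higher hazard-stall share reported for the dual-core runs.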

ISI distributions and spike rasters align with MATLAB double/fixed-point references, confirming numerical accuracy.

Sudoku Winner-Take-All (WTA) network (729 neurons):

Timestep latency: 2.06 ms (single-core), 1.22 ms (dual-core, $1.68\times$ speedup).
Speed-up over soft-float DTEK-V: ~40×.
D-cache hit: 100%.

Metric	Single-core	Dual-core
Timestep latency [ms]	2.0555	1.2223
Speed-up	1.00×	1.682×
IPC (avg)	0.530	0.496–0.419
Effective IPC (avg)	0.756	0.864–0.787
D-cache hit rate	100%	100%

ASIC implementations:

FreePDK45: $1.37$ GUpd/s/W at 49.5 mW
ASAP7: $9.67$ GUpd/s/W at 10.9 mW

6. Limitations and Prospective Enhancements

IzhiRISC-V substantially accelerates Izhikevich neuron networks by collapsing a 19-operation software kernel into a single-cycle custom instruction while keeping the RISC-V execution model. Numerical fidelity is preserved, as indicated by interspike-interval histograms and spike rasters that closely match reference simulations.

Noted limitations include:

Pipeline hazards induced by neuromorphic instructions, whose extra stalls keep IPC below the ideal 1.0.
Potential need for fixed-point retuning for specific biological or computational regimes.
Multi-core scaling curtailed by memory bus contention and cache miss penalties.

Future directions cited in the literature are:

CSR-based spike flag and configuration success reporting to ameliorate register hazards.
Network-level instructions supporting operations such as sparse-spike broadcast or synapse accumulation.
Support for additional neuron models (e.g., LIF, Adaptive Exponential IF) via NPU microcode expansion.
Integration of lightweight on-chip routers or a network-on-chip mesh for scaling core count beyond $\sim 64$.

IzhiRISC-V provides an approach for coupling general-purpose processor design with domain-specific acceleration, enabling large-scale, energy-efficient neuromorphic computing within standard RISC-V platforms (Szczerek et al., 18 Aug 2025).


References (1)

Szczerek et al. (2025). IzhiRISC-V -- a RISC-V-based Processor with Custom ISA Extension for Spiking Neuron Networks Processing with Izhikevich Neurons.
