
IzhiRISC-V: RISC-V for Neuromorphic SNNs

Updated 16 December 2025
  • IzhiRISC-V is a RISC-V-compliant processor architecture featuring a custom instruction set for efficient spiking neural network computation based on the Izhikevich neuron model.
  • It integrates a Neuron Processing Unit (NPU) and a Decay Unit (DCU) directly into the pipeline to achieve single-cycle state updates and synaptic decay with fixed-point arithmetic.
  • Benchmarks report a 1.64–1.68× dual-core speedup, hazard stalls below 7% of cycles, and up to 9.67 GUpd/s/W in ASAP7 synthesis.

IzhiRISC-V is a RISC-V-compliant processor architecture with a custom instruction set extension for efficient spiking neural network (SNN) computation, with particular emphasis on the Izhikevich neuron model. A Neuron Processing Unit (NPU) and a Decay Unit (DCU) are integrated into the baseline integer pipeline, where they accelerate neuron state updates and synaptic current decay with single-cycle, fixed-point hardware instructions. This increases compute density and energy efficiency for neuromorphic workloads (Szczerek et al., 18 Aug 2025).

1. Processor Architecture and Pipeline Integration

The IzhiRISC-V is derived from the DTEK-V baseline core, implementing the RV32IMZ instruction set (combining RV32I, M, and Zicsr). The core's pipeline is organized into three stages:

  • Fetch+Decode (IF/ID): Instruction fetch and decode are merged into a single stage.
  • Execute (EX): The main execution stage where both standard and neuromorphic instructions are dispatched.
  • Memory+Write-Back (MEM/WB): Memory access and data write-back are combined.

A forwarding unit mitigates read-after-write (RAW) hazards by propagating results back to the EX stage as needed, with stall cycles inserted only when unavoidable.

Neuromorphic enhancements are realized via hardware extensions directly merged into the pipeline's execution path:

  • The NPU and DCU are tightly grafted into the ALU datapath, sharing operand multiplexers and participating equally in the pipeline flow.
  • IF/ID recognizes neuromorphic instructions via the RISC-V custom-0 opcode ($0001011_2$).
  • In the EX stage, the hazard unit examines funct3 to route to one of: standard ALU, NPU, or DCU.
  • Results from the NPU/DCU are written to registers or main memory based on instruction semantics.

This integration maintains RISC-V programming conventions, removing the need for context switches to co-processors, while allowing fast hardware-accelerated SNN computations.
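The routing described above can be sketched in a few lines. This is a minimal illustration, not the actual RTL: the bit positions follow the standard RISC-V R-type layout, while the helper names `encode_r_type` and `select_unit`, the `funct7` value of 0, and the assumption that the two load instructions are handled by the NPU are all hypothetical.

```python
# Hypothetical dispatch sketch. Field positions follow the standard RISC-V
# R-type layout; the funct3-to-unit mapping is an assumption based on the
# four custom instructions this article lists, not the processor's RTL.
CUSTOM0_OPCODE = 0b0001011


def encode_r_type(funct7, rs2, rs1, funct3, rd, opcode=CUSTOM0_OPCODE):
    """Pack the R-type fields into a 32-bit instruction word."""
    return ((funct7 & 0x7F) << 25
            | (rs2 & 0x1F) << 20
            | (rs1 & 0x1F) << 15
            | (funct3 & 0x7) << 12
            | (rd & 0x1F) << 7
            | (opcode & 0x7F))


def select_unit(instr):
    """Return the name of the EX-stage unit that would take this word."""
    opcode = instr & 0x7F
    funct3 = (instr >> 12) & 0x7
    if opcode != CUSTOM0_OPCODE:
        return "ALU"                      # every non-custom-0 instruction
    if funct3 in (0b000, 0b001, 0b010):   # nmldl, nmldh, nmpn (assumed NPU)
        return "NPU"
    if funct3 == 0b011:                   # nmdec
        return "DCU"
    return "ILLEGAL"                      # funct3 values the article does not define


if __name__ == "__main__":
    word = encode_r_type(funct7=0, rs2=2, rs1=1, funct3=0b010, rd=3)
    print(f"{word:#010x} -> {select_unit(word)}")
```

Because the selection depends only on fields already present in the instruction word, no mode switch or co-processor handshake is needed, which is the property the paragraph above emphasizes.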

2. Custom ISA Extension: Neuromorphic Instructions

IzhiRISC-V defines a dedicated ISA extension using the RISC-V "custom-0" opcode format for neuromorphic processing, particularly optimized for Izhikevich neuron simulation. Four new instructions are introduced:

| Instruction | funct3 | Format | Semantics |
|---|---|---|---|
| nmldl | 000 | R | Load Izhikevich parameters $a$, $b$, $c$, $d$ into NPU configuration registers. Operands: $a$, $b$ (Q4.11 fixed-point), $c$ (Q7.8), $d$ (Q4.11). |
| nmldh | 001 | R | Load time step $h$ and clamp-voltage flag into NPU configuration (the $h$ bit selects step size; the pin bit sets voltage clamp behavior). |
| nmpn | 010 | N | Execute a single-cycle forward Euler update of the $(v, u)$ neuron state using the NPU. Inputs: address of VU state, $I_{syn}$ (Q15.16). Outputs: updated VU and a spike flag. |
| nmdec | 011 | R | Perform single-cycle exponential decay of the synaptic current; parameterized by $\tau$ and approximated via shift-and-add in the DCU. |

All instructions follow R-type or N-type encodings. The nmpn instruction uses its destination register both as a source (the VU word) and as the return location for the spike flag.
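To make the operand formats concrete, the sketch below quantizes a parameter set into the Q formats named in the table. It assumes that Qm.n means one sign bit, m integer bits, and n fractional bits (so Q4.11 and Q7.8 are 16-bit words and Q15.16 is a 32-bit word); the article does not spell this convention out. The helper names are hypothetical, and the parameter values are the commonly used regular-spiking set from Izhikevich's model, chosen only for illustration.

```python
def to_fixed(value, int_bits, frac_bits):
    """Quantize a real value to a signed Qm.n integer (assumed: 1 sign bit + m + n bits)."""
    raw = round(value * (1 << frac_bits))
    lo = -(1 << (int_bits + frac_bits))
    hi = (1 << (int_bits + frac_bits)) - 1
    if not lo <= raw <= hi:
        raise OverflowError(f"{value} does not fit in Q{int_bits}.{frac_bits}")
    return raw


def from_fixed(raw, frac_bits):
    """Convert a raw Qm.n integer back to a real value."""
    return raw / (1 << frac_bits)


if __name__ == "__main__":
    # Illustrative regular-spiking parameters (not taken from this article).
    a = to_fixed(0.02, 4, 11)    # 41, i.e. about 0.02002 after quantization
    b = to_fixed(0.2, 4, 11)     # 410
    c = to_fixed(-65.0, 7, 8)    # -16640
    d = to_fixed(8.0, 4, 11)     # 16384
    for name, raw, frac in (("a", a, 11), ("b", b, 11), ("c", c, 8), ("d", d, 11)):
        print(name, raw, from_fixed(raw, frac))
```

Under this convention Q4.11 covers [-16, 16) with a resolution of about 0.0005, which suits the small coefficients $a$ and $b$, while Q7.8 covers [-128, 128) and therefore accommodates a reset voltage such as -65 mV.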

3. Izhikevich Neuron Model: Hardware Mapping

The NPU specializes in fixed-point implementations of the Izhikevich neuron model as described by the following equations:

Continuous time:

$$\frac{dv}{dt} = 0.04\,v^2 + 5\,v + 140 - u + I_{syn}, \quad \frac{du}{dt} = a\,(b\,v - u)$$

with spike and reset applied when $v > V_{TH}$:

$$v \leftarrow c, \quad u \leftarrow u + d$$

Discretized for hardware as:

$$v_{n+1} = v_n + h\,(0.04\,v_n^2 + 5\,v_n + 140 - u_n + I_{syn}), \quad u_{n+1} = u_n + h\,a\,(b\,v_n - u_n)$$
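A floating-point reference for the discretized update may help when reading the equations. This is a sketch under stated assumptions, not the NPU's fixed-point datapath: the threshold value of 30 mV, the choice to test the threshold on the freshly updated voltage, and the regular-spiking default parameters are all assumptions that the article does not specify.

```python
V_TH = 30.0  # assumed spike threshold in mV; the article only names V_TH


def izhikevich_step(v, u, i_syn, a, b, c, d, h=1.0):
    """One forward Euler step of the Izhikevich model, followed by spike/reset."""
    v_next = v + h * (0.04 * v * v + 5.0 * v + 140.0 - u + i_syn)
    u_next = u + h * a * (b * v - u)
    spiked = v_next > V_TH          # assumed: threshold checked after the update
    if spiked:
        v_next = c                  # v <- c
        u_next = u_next + d         # u <- u + d
    return v_next, u_next, spiked


def count_spikes(i_syn, steps=1000, a=0.02, b=0.2, c=-65.0, d=8.0, h=1.0):
    """Simulate one neuron under constant input current and count its spikes."""
    v, u = c, b * c                 # start at the reset voltage with u = b*v
    spikes = 0
    for _ in range(steps):
        v, u, spiked = izhikevich_step(v, u, i_syn, a, b, c, d, h)
        spikes += int(spiked)
    return spikes


if __name__ == "__main__":
    print("spikes with I_syn = 0: ", count_spikes(0.0))
    print("spikes with I_syn = 10:", count_spikes(10.0))
```

With zero input the state settles toward the resting point and no spikes occur, while a constant input of 10 removes that resting point and produces repetitive firing. A reference of this kind is the sort of baseline against which a fixed-point implementation can be compared.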

The NPU implements:

  • Quadratic and linear terms using pipelined multiplies and accumulator units, with Q7.8 and Q4.11 fixed-point formats to balance range and resolution.
  • Reset logic, which checks for threshold crossing and applies reset in the same cycle, setting an LSB spike flag.
  • Exponential synaptic decay via DCU, using shift-and-add methods with low error (e.g., division by 2, 3, 7, or 8 yields errors under 0.4%), as tabulated below.

| Division | Approximation | Approx. Error |
|---|---|---|
| $x/2$ | $x \gg 1$ | 0% |
| $x/3$ | $(x \gg 2) + (x \gg 4) + \dots$ | 0.3906% |
| $x/7$ | $(x \gg 3) + (x \gg 6) + (x \gg 9)$ | 0.1953% |
| $x/8$ | $x \gg 3$ | 0% |
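The error figures in the table can be reproduced by summing the power-of-two coefficients exactly. The sketch below does this with Python's `fractions` module. One assumption is needed: the table elides the tail of the $x/3$ series, and the code assumes it stops after four terms (shifts of 2, 4, 6, and 8), which is the length that reproduces the listed 0.3906% error. The function names are hypothetical.

```python
from fractions import Fraction

# Shift amounts per divisor. The x/3 entry assumes the elided series stops
# after four terms; that choice reproduces the 0.3906% error in the table.
SHIFTS = {2: (1,), 3: (2, 4, 6, 8), 7: (3, 6, 9), 8: (3,)}


def shift_add_divide(x, divisor):
    """Approximate x / divisor for a non-negative integer using shifts and adds only."""
    return sum(x >> s for s in SHIFTS[divisor])


def coefficient_error_percent(divisor):
    """Relative error of the summed power-of-two coefficient versus 1/divisor."""
    approx = sum(Fraction(1, 1 << s) for s in SHIFTS[divisor])
    return float((1 - approx * divisor) * 100)


if __name__ == "__main__":
    for divisor in sorted(SHIFTS):
        print(divisor, shift_add_divide(4096, divisor),
              coefficient_error_percent(divisor))
```

For example, the three-term $x/7$ series sums to 73/512, and $1 - 7 \cdot 73/512 = 1/512 \approx 0.1953\%$. On real integer inputs the truncation performed by each shift adds a small further error on top of this coefficient error.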

This mapping enables tight hardware loops for neuron evolution while keeping results close to MATLAB double-precision and fixed-point baselines.

4. Microarchitectural Details and Physical Resource Utilization

ALU augmentation integrates the NPU and DCU as functional units within the EX stage. Configuration registers (NM_REGS) hold $a$, $b$, $c$, $d$, $h$, $\tau$, and the clamp flag. The RTL is written in VHDL using the IEEE fixed-point type (sfixed); resource optimizations such as multiplier sharing are intentionally omitted to maximize computation fidelity.

Synthesized utilization on Intel MAX10 FPGA (10M50DAF484C7G, dual-core at 30 MHz):

| Metric | Utilization |
|---|---|
| Logic elements | 49,248 (99%) |
| Flip-flops | 28,235 (51%) |
| BRAM | 346.5 Kb (21%) |
| 9-bit multipliers | 68 (24%) |

Scalability projections on Intel Agilex-7 (100 MHz) suggest:

  • 16 cores: 8% ALMs, 152 DSPs
  • 32 cores: 17% ALMs, 304 DSPs
  • 64 cores: 32% ALMs, 608 DSPs

After mapping to standard cells:

  • FreePDK45 (45 nm): 201.5 MHz, 67.6 M neuron-updates/s at 49.5 mW (1.37 GUpd/s/W), NPU area ≈ 20%, DCU < 2%
  • ASAP7 (7 nm): 316.3 MHz, 105.4 M neuron-updates/s at 10.9 mW (9.67 GUpd/s/W)
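The efficiency figures follow directly from the listed throughput and power values. The short check below recomputes them; the function names are hypothetical, and the cycles-per-update ratio is a derived quantity that the article itself does not report.

```python
def updates_per_second_per_watt(updates_per_second, power_watts):
    """Energy efficiency: neuron updates per second for each watt consumed."""
    return updates_per_second / power_watts


def cycles_per_update(clock_hz, updates_per_second):
    """Average clock cycles spent per neuron update (derived, not from the article)."""
    return clock_hz / updates_per_second


if __name__ == "__main__":
    nodes = {
        "FreePDK45": (201.5e6, 67.6e6, 49.5e-3),   # clock [Hz], updates/s, power [W]
        "ASAP7": (316.3e6, 105.4e6, 10.9e-3),
    }
    for name, (clock, rate, power) in nodes.items():
        gupd = updates_per_second_per_watt(rate, power) / 1e9
        print(f"{name}: {gupd:.2f} GUpd/s/W, "
              f"{cycles_per_update(clock, rate):.2f} cycles/update")
```

Both nodes work out to roughly three clock cycles per neuron update, so the efficiency gap between them comes from the higher clock rate and lower power of the 7 nm library rather than from a change in per-update cycle count.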

5. Benchmarking: Performance and Energy Efficiency

Benchmarks incorporate both synthetic networks and application-driven scenarios.

80-20 network (1,000 neurons, 1,000 timesteps, $h = 1$ ms):

  • Single-core: 7.87 s (127,000 neuron-updates/s).
  • Dual-core: 4.79 s (209,000 neuron-updates/s, 1.64× speedup).
  • Effective IPC: ~0.65 (ideal 1.0 without custom ops).
  • Cache hit rates: I-cache 99.97%, D-cache 96.5–97.2%.
  • Hazard stalls: 0.74% (single-core), 5.34–6.26% (dual-core).

| Metric | Single-core | Dual-core (each) |
|---|---|---|
| Execution time [s] | 7.870 | 4.791 |
| Speed-up | 1.00× | 1.643× |
| IPC | 0.574 | 0.532–0.519 |
| Effective IPC | 0.652 | 0.664–0.651 |
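The throughput and speedup values above can be rederived from the execution times alone. The sketch below does so and also computes a parallel efficiency (speedup divided by core count), a derived figure the article does not state; all function names are hypothetical.

```python
def throughput(neurons, timesteps, seconds):
    """Neuron updates per second for a full simulation run."""
    return neurons * timesteps / seconds


def speedup(t_single, t_multi):
    """Ratio of single-core to multi-core execution time."""
    return t_single / t_multi


def parallel_efficiency(t_single, t_multi, cores):
    """Speedup normalized by core count (1.0 would be perfect scaling)."""
    return speedup(t_single, t_multi) / cores


if __name__ == "__main__":
    t1, t2 = 7.870, 4.791  # execution times from the 80-20 benchmark table
    print(f"single-core: {throughput(1000, 1000, t1):,.0f} updates/s")
    print(f"dual-core:   {throughput(1000, 1000, t2):,.0f} updates/s")
    print(f"speedup:     {speedup(t1, t2):.3f}x")
    print(f"efficiency:  {parallel_efficiency(t1, t2, 2):.1%}")
```

The resulting efficiency of roughly 82% on two cores quantifies the gap from ideal scaling that the increased dual-core hazard stalls and lower D-cache hit rates point to.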

Interspike interval (ISI) distributions and spike rasters align with MATLAB double-precision and fixed-point references, confirming numerical accuracy.

Sudoku Winner-Take-All (WTA) network (729 neurons):

  • Timestep latency: 2.06 ms (single-core), 1.22 ms (dual-core, 1.68× speedup).
  • Speed-up over soft-float DTEK-V: ~40×.
  • D-cache hit: 100%.

| Metric | Single-core | Dual-core |
|---|---|---|
| Timestep latency [ms] | 2.0555 | 1.2223 |
| Speed-up | 1.00× | 1.682× |
| IPC (avg) | 0.530 | 0.496–0.419 |
| Effective IPC (avg) | 0.756 | 0.864–0.787 |
| D-cache hit rate | 100% | 100% |

ASIC implementations:

  • FreePDK45: 1.37 GUpd/s/W at 49.5 mW
  • ASAP7: 9.67 GUpd/s/W at 10.9 mW

6. Limitations and Prospective Enhancements

IzhiRISC-V substantially accelerates Izhikevich neuron networks by collapsing a 19-operation software kernel into a single-cycle custom instruction while keeping the RISC-V execution model intact. Numerical fidelity is preserved, as indicated by interspike interval histograms and spike rasters that are comparable to reference simulations.

Noted limitations include:

  • Pipeline hazards induced by neuromorphic instructions, whose extra stalls keep IPC below the ideal 1.0.
  • Potential need for fixed-point retuning for specific biological or computational regimes.
  • Multi-core scaling is curtailed by memory bus contention and cache miss penalties.

Future directions cited in the literature are:

  • CSR-based spike flag and configuration success reporting to ameliorate register hazards.
  • Network-level instructions supporting operations such as sparse-spike broadcast or synapse accumulation.
  • Support for additional neuron models (e.g., LIF, Adaptive Exponential IF) via NPU microcode expansion.
  • Integration of lightweight on-chip routers or network-on-chip mesh for scaling core count beyond ~64.

IzhiRISC-V provides an approach for coupling general-purpose processor design with domain-specific acceleration, enabling large-scale, energy-efficient neuromorphic computing within standard RISC-V platforms (Szczerek et al., 18 Aug 2025).
