Programmable Incoherent Photonic Neuromorphic Chip

Updated 4 July 2026

Programmable incoherent photonic neuromorphic computing chips are devices that use optical intensity and programmable photonic elements for neural computations without relying on coherent interference.
They leverage attenuation, mode converters, and microring transmissions to execute analog multiply–accumulate operations, reducing calibration overhead and phase noise sensitivity.
Experimental results demonstrate sub-nanosecond inference and energy-efficient performance, making these systems promising for high-speed, robust, and scalable neural applications.

A programmable incoherent photonic neuromorphic computing chip is an integrated photonic system in which neural computation is implemented in the optical intensity domain rather than through phase-sensitive interferometric processing, with weights, neuron operating points, or material states set by programmable photonic elements. In this class of hardware, the canonical neural operation remains $z_i = b_i + \sum_j W_{ij} x_j$, followed by $y_i = \varphi(z_i)$, but weighting is realized by optical attenuation, modal conversion, microring transmission, or simplified interferometric meshes operated with direct detection, while summation occurs through photocurrent or incoherent power addition. The resulting systems are intended to reduce sensitivity to phase noise, relax interferometric stabilization requirements, and, in several recent demonstrations, move not only linear matrix–vector multiplication but also nonlinear or spiking activation closer to the photonic substrate (Li et al., 2023, Xiang et al., 9 Aug 2025).

1. Conceptual basis and operating regime

Incoherent photonic neuromorphic computing refers to architectures in which only optical intensities or powers are computationally relevant. Distinct wavelength channels, spatial modes, or transmission amplitudes encode the signal, and readout relies on direct detection or balanced photodetection rather than coherent field recovery. This contrasts with coherent interferometric optical neural networks, especially programmable MZI meshes implementing unitary transforms via phase control, where computation depends directly on stable optical phase relationships and therefore inherits sensitivity to phase noise, thermal drift, and calibration overhead (Li et al., 2023).
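The practical difference between the two regimes can be illustrated numerically. The following is a minimal sketch, not taken from the cited works: it assumes two contributions of fixed power that either interfere with a relative phase (coherent case) or add as intensities (incoherent case). The function names and the power values are illustrative assumptions.

```python
import cmath
import math


def coherent_power(p1, p2, phase):
    """Detected power when two fields of powers p1 and p2 interfere
    with the given relative phase (radians)."""
    field = math.sqrt(p1) + math.sqrt(p2) * cmath.exp(1j * phase)
    return abs(field) ** 2


def incoherent_power(p1, p2):
    """Detected power when the two contributions add as intensities."""
    return p1 + p2


# Sweep the relative phase from 0 to 2*pi in steps of pi/4.
phases = [k * math.pi / 4 for k in range(9)]
coherent = [coherent_power(1.0, 0.25, ph) for ph in phases]
incoherent = [incoherent_power(1.0, 0.25) for _ in phases]

print("coherent  :", [round(p, 3) for p in coherent])
print("incoherent:", [round(p, 3) for p in incoherent])
```

In this toy model the coherent output swings between $(\sqrt{P_1}-\sqrt{P_2})^2$ and $(\sqrt{P_1}+\sqrt{P_2})^2$ as the phase drifts, whereas the incoherent sum stays at $P_1 + P_2$, which is the sense in which intensity-domain computation avoids phase stabilization.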

Within this intensity-domain regime, programmability can take several forms. In photonic classifier chips and tensor-core-like accelerators, weights are programmed as transmission states of attenuators, microrings, or phase-change materials. In spiking neuromorphic systems, neuron thresholds and excitability are also tunable, so programmability spans both synaptic weighting and nonlinear dynamics. The distinguishing point is not merely the use of optics for matrix multiplication, but the combination of programmable weight storage or routing with an incoherent signal path whose output is determined by power superposition and thresholding rather than phase-resolved interference (Xiang et al., 17 Jun 2025).

A central motivation, articulated explicitly in recent work, is to remove the “linear-only” bottleneck of prior photonic accelerators. Many earlier photonic neural chips accelerate matrix–vector multiplication but return nonlinear activation to electronics through optical-to-electrical and electrical-to-optical (OE/EO) conversion plus analog-to-digital and digital-to-analog (AD/DA) conversion. That round trip adds latency and power consumption and hinders the cascading of optical layers. The most advanced programmable incoherent platforms therefore aim either to restrict the electronic role to biasing and readout or to implement both linear and nonlinear spike computations in the optical domain (Xiang et al., 9 Aug 2025).

2. Device architectures and implementation families

Several distinct hardware families now fall under the topic.

Platform	Programmable weight mechanism	Nonlinearity / neuron mechanism
16-channel photonic spiking layer (Xiang et al., 9 Aug 2025)	Simplified 16×16 MZI mesh on SOI	16-channel DFB-SA laser neuron array on III–V
Multimode PMMC core (Wu et al., 2020)	GST phase-change metasurface mode converters on SiN waveguides	ReLU and sigmoid handled electronically between photonic layers
CMOS PSNN chip (Xiang et al., 17 Jun 2025)	MRR synapse arrays with photoconductive heaters	MRM spiking neurons driven by photodetected currents
Low-loss PMC tensor core (Shen et al., 11 Mar 2026)	Sb2Se3 mode converters on ultrathin SOI waveguides	Balanced-photodetector readout for incoherent MAC

The 2025 spiking reinforcement-learning chip integrates two subsystems fabricated on different material platforms and co-optimized for spiking computation: a simplified 16×16 Mach–Zehnder interferometer mesh on silicon-on-insulator for linear synaptic weighting and a 16-channel distributed-feedback laser with saturable absorber array on a III–V platform for nonlinear spike generation. Together they form a 16-channel, fully optical SNN layer with 272 trainable parameters: $16\times16$ synaptic weights and 16 per-channel activation thresholds (Xiang et al., 9 Aug 2025).

A different approach appears in the multimode phase-change metasurface platform, where a SiN waveguide supports TE0 and TE1 modes and each PMMC contains a linear array of 25 $\mathrm{Ge_2Sb_2Te_5}$ nano-antennae. Here the programmable quantity is not interferometric phase in a mesh but modal conversion contrast, read out by separating TE0 and TE1 powers and electronically differencing them. The demonstrated core is a 2×2 PMMC array acting as a 2×2 photonic kernel for convolution (Wu et al., 2020).

The CMOS-compatible PSNN adopts a broadcast-and-weight architecture. Input spike trains are wavelength-division multiplexed, broadcast by power splitters, weighted by MRR synapse arrays, summed by on-chip photodetectors, and used to drive optoelectronic microring spiking neurons. Operation is explicitly incoherent: optical carriers at distinct wavelengths are intensity modulated, synapses control transmission amplitude, photodetectors sum powers rather than phases, and nonlinear activation arises from free-carrier dynamics in electrically driven MRMs (Xiang et al., 17 Jun 2025).
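The broadcast-and-weight signal path can be sketched in a few lines. This is an idealized illustration under stated assumptions, not the chip's actual transfer function: the splitter is taken as lossless and uniform, each synapse is reduced to a single power transmission between 0 and 1, and the photodetector has a hypothetical responsivity parameter. The function name and the example values are not from the source.

```python
def broadcast_and_weight(channel_powers, transmissions, responsivity=1.0):
    """Return one photocurrent per output neuron.

    channel_powers[j]  : optical power on wavelength channel j (nonnegative)
    transmissions[i][j]: power transmission (0..1) of the synapse that
                         weights channel j for output neuron i
    responsivity       : detector responsivity (assumed, arbitrary units)
    """
    n_out = len(transmissions)
    # Ideal lossless splitter: every output branch receives 1/n_out of each channel.
    split = [p / n_out for p in channel_powers]
    currents = []
    for row in transmissions:
        if len(row) != len(split):
            raise ValueError("each synapse row needs one entry per channel")
        # The photodetector sums powers across wavelengths; no phase enters.
        currents.append(responsivity * sum(t * p for t, p in zip(row, split)))
    return currents


powers = [1.0, 0.5, 0.0, 2.0]            # four WDM channels, arbitrary units
synapses = [[1.0, 0.0, 0.5, 0.25],       # output neuron 0
            [0.2, 1.0, 1.0, 0.5]]        # output neuron 1
currents = broadcast_and_weight(powers, synapses)
print(currents)
```

Because transmissions are nonnegative, this sketch produces only nonnegative weights; signed weights require an additional construction such as differential detection.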

A further branch replaces GST with low-loss $\mathrm{Sb_2Se_3}$ and encodes weights through refractive-index-induced mode conversion rather than absorption. In that design, a shallow-etched slot in a 70-nm-thick SOI strip waveguide is filled with a 25-nm $\mathrm{Sb_2Se_3}$ patch, and the programmable state modulates TE0↔TE1 conversion and port contrast. This architecture is explicitly presented as an incoherent multimode, WDM-enabled tensor core (Shen et al., 11 Mar 2026).

3. Linear weighting, signed arithmetic, and optical nonlinearity

The linear stage in programmable incoherent photonic chips is typically a form of analog multiply–accumulate or matrix–vector multiplication. The key engineering problem is to realize real-valued or signed weights without reverting to large coherent meshes or extensive electronic correction.

In the 16-channel spiking RL chip, the MZI mesh is simplified for SNNs by using an $(N+1)\times(N+1)$ unitary mesh followed by an $N\times N$ diagonal matrix to implement an $N\times N$ real-valued weight matrix. For $N=16$ , the total number of phase shifters is $y_i = \varphi(z_i)$ 0, the network depth is $y_i = \varphi(z_i)$ 1, each MZI has a single phase shifter on one inner arm, the half-wave phase shift requires a loaded power $y_i = \varphi(z_i)$ 2 mW, the packaged mesh area is $y_i = \varphi(z_i)$ 3, the 3-dB optical bandwidth is $y_i = \varphi(z_i)$ 4 nm, and total insertion loss is $y_i = \varphi(z_i)$ 5 dB. Although MZIs are present, the inputs are incoherent multi-wavelength CW carriers and the outputs are processed by direct detection, so system behavior depends on intensity transfer rather than optical phase locking (Xiang et al., 9 Aug 2025).

The PMMC platform encodes weights through modal contrast. The modal purity is defined as

$y_i = \varphi(z_i)$ 6

and the programmable contrast parameter is

$y_i = \varphi(z_i)$ 7

with theoretical range $y_i = \varphi(z_i)$ 8. The device achieves 64 distinct, repeatable contrast levels, corresponding to 6-bit resolution, and spans approximately $y_i = \varphi(z_i)$ 9 to $16\times16$ 0 at 1555 nm, thereby supporting both positive and negative weights without offset. The resulting MAC is read as $16\times16$ 1, giving $16\times16$ 2 (Wu et al., 2020).
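A signed weight derived from two mode powers, together with 6-bit programming, can be sketched as follows. The contrast is assumed here to be the normalized power difference between the TE0 and TE1 outputs, which is a common convention but is an assumption of this sketch rather than a definition recovered from the source. The quantization range of $[-1, 1]$ is likewise hypothetical; the device's measured span is narrower.

```python
def mode_contrast(p_te0, p_te1):
    """Assumed contrast: normalized difference of the two mode powers."""
    total = p_te0 + p_te1
    if total <= 0:
        raise ValueError("total output power must be positive")
    return (p_te0 - p_te1) / total


def quantize_weight(w, levels=64, w_min=-1.0, w_max=1.0):
    """Map a target weight to the nearest of `levels` evenly spaced states.

    Returns (state index, programmed weight). The range is hypothetical.
    """
    w = min(max(w, w_min), w_max)
    step = (w_max - w_min) / (levels - 1)
    index = round((w - w_min) / step)
    return index, w_min + index * step


def signed_mac(inputs, contrasts):
    """Differential readout: sum of input powers times signed contrasts."""
    return sum(x * g for x, g in zip(inputs, contrasts))


index, programmed = quantize_weight(0.3)
kernel = [quantize_weight(w)[1] for w in (0.3, -0.7, 1.0, -1.0)]
print(index, programmed, signed_mac([1.0, 0.5, 0.2, 0.1], kernel))
```

Sixty-four levels correspond to $\log_2 64 = 6$ bits, and because the contrast changes sign when the TE1 power exceeds the TE0 power, negative weights need no added offset in this picture.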

The low-loss $\mathrm{Sb_2Se_3}$ PMC platform uses a similar but not identical signed-weight strategy. The converter supports only TE0 and TE1, with computed effective indices $16\times16$ 4 and $16\times16$ 5 at 1550 nm, giving the phase-matching period $16\times16$ 6. The programmed mode-contrast parameter $16\times16$ 7 sweeps from $16\times16$ 8 to $16\times16$ 9 across 32 states, and balanced photodetection computes an effective $\mathrm{Ge_2Sb_2Te_5}$ 0 with $\mathrm{Ge_2Sb_2Te_5}$ 1 set by $\mathrm{Ge_2Sb_2Te_5}$ 2 (Shen et al., 11 Mar 2026).

Nonlinearity is the sharper discriminator among architectures. In the PMMC optical CNN, ReLU and sigmoid are handled electronically between photonic layers, and signals are re-encoded optically for subsequent layers. This architecture is therefore programmable and incoherent in its linear algebra, but not fully optical in its layerwise nonlinearity (Wu et al., 2020). By contrast, the 16-channel RL chip implements nonlinear spike activation optically: under incoherent optical injection, each DFB-SA neuron exhibits a leaky integrate-and-fire-like threshold response, with no spike below threshold and a stereotyped spike above threshold whose amplitude is largely independent of input amplitude because the saturable absorber clips the response (Xiang et al., 9 Aug 2025).
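The thresholded, amplitude-clipped response described for the laser neurons can be abstracted as a static threshold layer. The sketch below is a deliberately simplified model: it ignores leaky integration and laser dynamics, and the function names, weights, and thresholds are illustrative assumptions. Only the parameter count for $N = 16$ is taken from the text.

```python
def spiking_layer(input_powers, weights, thresholds, spike_amplitude=1.0):
    """Weighted sum per channel followed by an all-or-nothing spike.

    A channel emits a spike of fixed amplitude when its weighted input
    exceeds its threshold, and nothing otherwise.
    """
    outputs = []
    for row, theta in zip(weights, thresholds):
        z = sum(w * x for w, x in zip(row, input_powers))
        outputs.append(spike_amplitude if z > theta else 0.0)
    return outputs


def parameter_count(n):
    """Trainable parameters of an n-channel layer: n*n weights plus n thresholds."""
    return n * n + n


weights = [[0.5, 0.5], [1.0, -0.5]]
thresholds = [0.4, 0.4]
weak = spiking_layer([0.2, 0.2], weights, thresholds)
moderate = spiking_layer([1.0, 0.2], weights, thresholds)
strong = spiking_layer([10.0, 0.2], weights, thresholds)
print(weak, moderate, strong, parameter_count(16))
```

The moderate and strong inputs give identical outputs, mirroring the stereotyped spike amplitude above threshold, and `parameter_count(16)` reproduces the 272 trainable parameters quoted for the 16-channel layer.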

The CMOS PSNN realizes a different spiking nonlinearity. The neuron is an electrically driven microring modulator integrated with a p–n diode; the summed photodetector current injects carriers into the MRM, the neuron integrates carrier density, and when the accumulated density crosses an excitability threshold the MRM emits an optical spike. Bifurcation analysis reveals a subcritical Hopf bifurcation at injected carrier density $\mathrm{Ge_2Sb_2Te_5}$ 3, consistent with a Class II resonate-and-fire neuron, and ion implantation raising doping from $\mathrm{Ge_2Sb_2Te_5}$ 4 to $\mathrm{Ge_2Sb_2Te_5}$ 5 enables a maximum firing rate of 4 GHz with a refractory interval of $\mathrm{Ge_2Sb_2Te_5}$ 6 ns (Xiang et al., 17 Jun 2025).

4. Programming, calibration, and hardware-aware learning

Programmability in these chips is not limited to writing static weights; it also includes calibration against fabrication variation, tuning of neuron thresholds, and, in the most advanced cases, in-situ or collaborative training procedures that close the sim-to-real gap.

One programming model is offline training followed by direct write-in. The PMMC optical CNN trains kernels and fully connected weights offline via backpropagation, then programs the trained values into GST metasurfaces using 50 ns optical control pulses that reach stable nonvolatile intermediate states. This provides 64 stable weight levels and preserves the in-memory character of the computation (Wu et al., 2020).

A more elaborate model appears in photonic spiking reinforcement learning. The actor–critic system combines an SNN actor and an ANN critic, with the actor deployed on chip during inference. The workflow has four stages: software pre-training with surrogate gradients at a single time step $\mathrm{Ge_2Sb_2Te_5}$ 7; local photonic in-situ training of the MZI mesh by stochastic parallel gradient descent; hardware-aware software fine-tuning with hardware weights fixed; and hardware-aware collaborative inference in which the photonic chip executes the L2 linear layer and the DFB-SA array executes the hardware spike activation. In the MZI-mapping step, the similarity between target and measured matrices is

$\mathrm{Ge_2Sb_2Te_5}$ 8

where $\mathrm{Ge_2Sb_2Te_5}$ 9 and $\mathrm{Sb_2Se_3}$ 0; random perturbations $\mathrm{Sb_2Se_3}$ 1 of phase-shifter voltages generate similarities $\mathrm{Sb_2Se_3}$ 2 and $\mathrm{Sb_2Se_3}$ 3, and $\mathrm{Sb_2Se_3}$ 4 estimates the local gradient used to update voltages (Xiang et al., 9 Aug 2025).
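The two-sided perturbation update used in stochastic parallel gradient descent can be sketched on a toy model. Everything hardware-specific here is an assumption: the "measured matrix" is a hypothetical stand-in in which each element follows a $\sin^2$ intensity transfer of one control voltage, and the similarity metric is taken as one minus the normalized squared error, which is not necessarily the metric used in the cited work. The gain, perturbation size, and target values are illustrative.

```python
import math
import random


def measured_matrix(voltages):
    """Toy stand-in for the chip: element k has intensity transfer sin^2(v_k)."""
    return [math.sin(v) ** 2 for v in voltages]


def similarity(target, measured):
    """Assumed metric: 1 minus normalized squared error (1.0 is a perfect match)."""
    err = sum((t - m) ** 2 for t, m in zip(target, measured))
    return 1.0 - err / sum(t * t for t in target)


def spgd(target, voltages, steps=3000, sigma=0.05, gain=10.0, seed=0):
    """Stochastic parallel gradient ascent on the similarity."""
    rng = random.Random(seed)
    v = list(voltages)
    for _ in range(steps):
        # Perturb all controls at once with random signs.
        delta = [sigma * rng.choice((-1.0, 1.0)) for _ in v]
        s_plus = similarity(target, measured_matrix([a + d for a, d in zip(v, delta)]))
        s_minus = similarity(target, measured_matrix([a - d for a, d in zip(v, delta)]))
        # (s_plus - s_minus) * delta is a noisy estimate of the local gradient direction.
        v = [a + gain * (s_plus - s_minus) * d for a, d in zip(v, delta)]
    return v


target = [0.9, 0.2, 0.4, 0.7]
start = [0.3, 0.3, 0.3, 0.3]
tuned = spgd(target, start)
s_start = similarity(target, measured_matrix(start))
s_final = similarity(target, measured_matrix(tuned))
print(round(s_start, 3), round(s_final, 3))
```

The appeal of this scheme for photonic hardware is that it needs only two similarity measurements per iteration regardless of the number of phase shifters, and no analytic model of the mesh.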

The CMOS PSNN demonstrates true in-situ supervised synaptic plasticity rather than export-and-map programming. A Python controller orchestrates lasers, AWG, voltage sources, and oscilloscope via PyVISA; the chip performs on-chip weighting and spiking; spike timing is extracted from measured outputs; and synaptic heater voltages are updated accordingly. The modified ReSuMe rule uses the timing difference between desired teacher spikes and actual postsynaptic response. The learning window is

$\mathrm{Sb_2Se_3}$ 5

with $\mathrm{Sb_2Se_3}$ 6, $\mathrm{Sb_2Se_3}$ 7 ns, $\mathrm{Sb_2Se_3}$ 8 ns, and learning rate $\mathrm{Sb_2Se_3}$ 9 (Xiang et al., 17 Jun 2025).
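The structure of a ReSuMe-style update can be sketched with the textbook form of the rule, in which teacher spikes potentiate and actual output spikes depress a synapse in proportion to recent presynaptic activity. This is not the modified rule of the cited work: the exponential window, its amplitude and time constant, the offset term, and the learning rate below are all hypothetical placeholders, and the mapping from weight change to heater voltage is omitted.

```python
import math


def learning_window(s_ns, amplitude=1.0, tau_ns=1.0):
    """Assumed causal exponential window; zero for presynaptic spikes in the future."""
    return amplitude * math.exp(-s_ns / tau_ns) if s_ns >= 0 else 0.0


def resume_update(pre_spikes, teacher_spikes, output_spikes, lr=0.1, offset=0.0):
    """Weight change for one synapse under a standard ReSuMe-style rule.

    Spike times are in nanoseconds. Teacher spikes increase the weight,
    actual output spikes decrease it, each scaled by windowed presynaptic activity.
    """
    def drive(t):
        return offset + sum(learning_window(t - t_pre) for t_pre in pre_spikes)

    potentiation = sum(drive(t) for t in teacher_spikes)
    depression = sum(drive(t) for t in output_spikes)
    return lr * (potentiation - depression)


missing = resume_update([1.0], [1.5], [])       # neuron failed to fire
matched = resume_update([1.0], [1.5], [1.5])    # output coincides with teacher
spurious = resume_update([1.0], [], [1.5])      # neuron fired without a teacher
late = resume_update([1.0], [1.5], [2.5])       # output arrives after the teacher
print(missing, matched, spurious, late)
```

The update vanishes once output spikes coincide with teacher spikes, which is the fixed point an in-situ training loop of this kind drives toward.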

These workflows show that “programmable” has acquired a broader meaning in neuromorphic photonics. It includes thermally tuned MZI phase shifters, in-ring photoconductive heaters in MRRs, neuron bias control in MRMs or DFB-SA devices, nonvolatile phase-change states in PCM metasurfaces, and algorithmic compensation stages that adapt network weights to analog noise, drift, and fabrication variability. This suggests that programmability is now inseparable from calibration and training methodology rather than being only a property of a passive photonic fabric (Li et al., 2023).

5. Experimental benchmarks and reported performance

The most fully integrated demonstration of a programmable incoherent photonic spiking layer reports both photonic linear and photonic nonlinear performance metrics. With clocking limited by the $\mathrm{Sb_2Se_3}$ 0 GHz self-pulsation limit, the MZI mesh achieves an estimated linear MVM throughput of 2.5 TOPS; with total power $\mathrm{Sb_2Se_3}$ 1 W and area 19.55 mm², the energy efficiency is 1.39 TOPS/W and the compute density is 0.13 TOPS/mm². The 16-channel DFB-SA array provides 640 GOPS of element-wise spike activations; with power $\mathrm{Sb_2Se_3}$ 2 W and area $\mathrm{Sb_2Se_3}$ 3 mm², the energy efficiency is 987.65 GOPS/W and the compute density is 533.33 GOPS/mm². The measured/calculated latency for one full photonic SNN layer is 320 ps (Xiang et al., 9 Aug 2025).
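The quoted throughput, efficiency, and density figures can be cross-checked against one another with simple arithmetic. The sketch below derives the power and area implied by those ratios. The derived numbers are consequences of the figures stated above, not values quoted from the cited work, and the variable names are illustrative.

```python
def implied_power_w(throughput_gops, efficiency_gops_per_w):
    """Power implied by a throughput and an energy efficiency."""
    return throughput_gops / efficiency_gops_per_w


def implied_area_mm2(throughput_gops, density_gops_per_mm2):
    """Area implied by a throughput and a compute density."""
    return throughput_gops / density_gops_per_mm2


# MZI mesh: 2.5 TOPS, 1.39 TOPS/W, 19.55 mm^2 (figures from the text).
mesh_power_w = implied_power_w(2500.0, 1390.0)
mesh_density_tops_mm2 = 2.5 / 19.55

# DFB-SA array: 640 GOPS, 987.65 GOPS/W, 533.33 GOPS/mm^2 (figures from the text).
dfb_power_w = implied_power_w(640.0, 987.65)
dfb_area_mm2 = implied_area_mm2(640.0, 533.33)
dfb_gops_per_channel = 640.0 / 16

print(round(mesh_power_w, 3), round(mesh_density_tops_mm2, 3))
print(round(dfb_power_w, 3), round(dfb_area_mm2, 3), dfb_gops_per_channel)
```

The mesh density computed from throughput and area agrees with the stated 0.13 TOPS/mm² after rounding, which suggests the reported figures are mutually consistent.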

On reinforcement-learning tasks, the same system demonstrates both discrete and continuous control. For CartPole, the actor is $\mathrm{Sb_2Se_3}$ 4 and the critic is $\mathrm{Sb_2Se_3}$ 5; baseline software training reaches reward 200 at epoch 429 and stabilizes after $\mathrm{Sb_2Se_3}$ 6 epochs, while hardware-aware fine-tuning reaches reward 200 at epoch 38 and stabilizes by $\mathrm{Sb_2Se_3}$ 7 epochs. In hardware inference on 200 state–action pairs, the DFB-SA spike activations have a layer error rate of 0.6875%, end-to-end accuracy is 98.5%, and reward converges to 200, matching conventional PPO. For Pendulum, the actor is $\mathrm{Sb_2Se_3}$ 8 and the critic is $\mathrm{Sb_2Se_3}$ 9; hardware inference on 200 pairs yields a layer error rate of 0.125%, 98% end-to-end accuracy, and reward converges to $(N+1)\times(N+1)$ 0, consistent with standard PPO (Xiang et al., 9 Aug 2025).

The CMOS PSNN reports a different benchmark profile. It achieves 80% accuracy on both train and test for a three-class subset of KTH human action recognition after in-situ convergence, using frame-free retina-inspired event encoding. The encoded pixel time slot is 0.28 ns, giving an effective per-frame time rate of 3.57 GHz; the end-to-end classification time for a 40-keyframe video is 62.76 ns, corresponding to 15.93 MHz overall video processing speed. The system reports computing density 114.3 GMAC/s/mm² and energy efficiency greater than 2.75 GSOP/W, and the event-driven frame-free pipeline is described as operating at $(N+1)\times(N+1)$ 1 faster processing speeds than conventional frame-based approaches (Xiang et al., 17 Jun 2025).
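The rate figures in this benchmark follow from the stated time intervals by taking reciprocals, as the short check below shows. The variable names are illustrative, and the per-keyframe time is a derived quantity rather than a figure reported in the source.

```python
pixel_slot_s = 0.28e-9     # encoded pixel time slot (from the text)
video_time_s = 62.76e-9    # end-to-end time for a 40-keyframe video (from the text)
keyframes = 40

slot_rate_ghz = 1.0 / pixel_slot_s / 1e9
video_rate_mhz = 1.0 / video_time_s / 1e6
time_per_keyframe_ns = video_time_s / keyframes * 1e9   # derived, not quoted

print(round(slot_rate_ghz, 2), round(video_rate_mhz, 2), round(time_per_keyframe_ns, 3))
```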

The PMMC convolutional and classification chip demonstrates a slower laboratory prototype but a strong precision and compactness argument. The present experiment operates at $(N+1)\times(N+1)$ 2 kHz because of low-speed VOAs used for amplitude encoding, but the authors project tens of Gb/s per channel with integrated transmitters and photodetectors, and propose four-channel WDM parallelism. The reported throughput density under that projection is $(N+1)\times(N+1)$ 3 TOPS/mm². Experimentally, the compact optical CNN achieves 91% accuracy on MNIST “1” vs “2” classification over 100 test images, comparable to a 90% digital baseline, while the PMMC itself provides 6-bit resolution, TE0 insertion loss as low as 0.9 dB at 1575 nm, and a complete device footprint of $(N+1)\times(N+1)$ 4 (Wu et al., 2020).

More broadly, review-level comparisons place such intensity-domain programmable chips among the most mature sub-nanosecond photonic neural architectures. Representative values cited for a programmable noninterferometric PDNN are 2.9 TOP/J, 3.5 TOP/mm²/s, and latency under 570 ps, while a photonic tensor core based on PCM and WDM reports $(N+1)\times(N+1)$ 5 fJ/MAC, bandwidth greater than 14 GHz, 0.4 TOP/J, and 1.2 TOP/mm²/s. These comparisons reinforce that programmable incoherent photonic chips are already competitive in latency and often attractive in energy efficiency, even though peta-level performance remains exceptional rather than typical (Li et al., 2023).

6. Limitations, misconceptions, and scaling directions

A common misconception is that “photonic neural chip” implies fully optical end-to-end execution. The experimental record is more heterogeneous. Some systems, such as the PMMC optical CNN, keep the heavy linear algebra in optics but perform bias addition and nonlinear activation electronically, with optical re-encoding between layers. Others, such as the 16-channel photonic RL chip, implement both linear MVM and spike activation optically within one layer. The distinction matters because OE/EO interfaces remain a primary source of latency and power overhead in many otherwise photonic accelerators (Wu et al., 2020, Xiang et al., 9 Aug 2025).

Another misconception is that incoherent operation means reduced capability. In fact, intensity-domain schemes support several weight-encoding mechanisms: signed modal contrast in multimode PCM devices, thermal or electrical tuning of MRR transmissions, neuron bias control, and balanced-photodetector readout. Their principal advantage is robustness to phase noise and the avoidance of interferometric stabilization. The main tradeoff is that arbitrary complex-valued linear algebra is less direct than in coherent meshes, and signed weights may require special constructions such as add-drop MRRs with balanced photodetectors or modal differencing (Niekerk et al., 2022, Xiang et al., 17 Jun 2025).

The leading limitations are now architecture-specific rather than conceptual. In GST-based PMMC systems, the TE1 path has $(N+1)\times(N+1)$ 6 dB insertion loss because of crystalline GST absorption, combining is partly off-chip, system speed is limited by external amplitude modulators, and programmed states can drift over time. In the CMOS PSNN, the demonstrated hardware scale is still limited: the 4096×5 video task used electronic-domain linear multiplication because the on-chip synapse array was too small, and multiple output neurons were time-multiplexed in one MRM. In review-level assessments, volatile weights, thermal crosstalk, detector linearity, shot noise, routing density, packaging, and the continued difficulty of on-chip nonlinear activation remain recurrent bottlenecks (Wu et al., 2020, Xiang et al., 17 Jun 2025, Li et al., 2023).

Low-loss PCM strategies address one major scalability bottleneck directly. The $\mathrm{Sb_2Se_3}$ PMC work argues that the central issue with GST crossbar arrays is the high extinction coefficient $(N+1)\times(N+1)$ 8 of crystalline GST at telecom wavelengths, which causes cumulative loss and keeps practical arrays near tens of cells per row. In the low-loss alternative, $(N+1)\times(N+1)$ 9 near 1550 nm in both phases, while $N\times N$ 0 remains available for programming through mode conversion. Using a simple loss model with threshold $N\times N$ 1 dB, GST arrays are estimated at about $N\times N$ 2 in the worst crystalline case and about $N\times N$ 3 in the amorphous case, whereas the per-device insertion loss of $N\times N$ 4 dB in the $\mathrm{Sb_2Se_3}$ PMC projects feasibility of arrays at least $N\times N$ 6 (Shen et al., 11 Mar 2026).
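The scaling argument can be illustrated with a loss-budget estimate. The sketch assumes that insertion loss in decibels accumulates linearly along a row and that a row is usable while total loss stays within a fixed budget; the cited work's actual loss model may differ. The budget and per-cell losses below are hypothetical placeholders chosen only to show the trend, not values from the source.

```python
import math


def max_cells_per_row(loss_budget_db, loss_per_cell_db):
    """Largest number of cascaded cells whose summed loss fits the budget."""
    # The small epsilon guards against floating-point round-down at exact ratios.
    return math.floor(loss_budget_db / loss_per_cell_db + 1e-9)


def transmitted_fraction(n_cells, loss_per_cell_db):
    """Fraction of input power remaining after n_cells cascaded cells."""
    return 10 ** (-n_cells * loss_per_cell_db / 10)


budget_db = 20.0          # hypothetical loss budget
lossy_cell_db = 1.0       # hypothetical absorptive PCM cell
low_loss_cell_db = 0.1    # hypothetical low-loss PCM cell

lossy_limit = max_cells_per_row(budget_db, lossy_cell_db)
low_loss_limit = max_cells_per_row(budget_db, low_loss_cell_db)
print(lossy_limit, low_loss_limit, transmitted_fraction(lossy_limit, lossy_cell_db))
```

Under this linear model the achievable row length scales inversely with per-cell loss, so a tenfold reduction in insertion loss permits a roughly tenfold longer row for the same budget.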

Scaling directions in recent work are therefore relatively concrete. The 16-channel photonic RL architecture is described as extendable because DFB-SA neuron arrays can be extended to approximately 150 channels with high wavelength precision and MZI meshes can scale to 64×64 or 128×128 with acceptable loss. The CMOS PSNN emphasizes full foundry compatibility and the feasibility of deeper PSNNs with available silicon photonic libraries. The review literature, however, remains cautious: calibration overhead, long-term drift, weight retention, and the need for better detector, modulator, and nonvolatile-memory performance still separate current sub-nanosecond demonstrators from broadly deployable large-scale neuromorphic photonic processors (Xiang et al., 9 Aug 2025, Xiang et al., 17 Jun 2025, Li et al., 2023).

In that sense, programmable incoherent photonic neuromorphic computing is best understood not as a single device category but as an emerging hardware regime. Its defining features are intensity-domain computation, programmable synaptic or modal weighting, direct-detection summation, and an increasingly diverse set of nonlinear mechanisms ranging from electronic ReLU blocks to GHz spiking photonic neurons. The field’s present significance lies in showing that these ingredients can already support sub-nanosecond inference, event-driven spiking dynamics, and experimentally validated control or vision tasks, while its unresolved questions concern scalable nonlinearity, long-term stability, and large-array integration.