Optical Coherent Dot-Product Chip
- OCDC is an integrated photonic processor that computes optical dot products by encoding vector elements in light’s phase and amplitude.
- It employs cascaded interferometers, inverse-designed nanophotonic cores, and programmable crossbars to achieve high-throughput, energy-efficient multiply–accumulate operations.
- OCDCs are pivotal in AI acceleration, scientific computing, and communications, offering robust on-chip calibration and scalable photonic-electronic integration.
An Optical Coherent Dot-Product Chip (OCDC) is an integrated photonic processor that computes the dot product of optically encoded vectors, exploiting the phase, amplitude, and coherence of light fields for high-throughput, low-energy multiply–accumulate operations. Core operations leverage optical interference within circuit-level architectures including cascaded interferometers, inverse-designed nanophotonic cavities, or arrayed coherent crossbars. OCDCs have evolved from theoretical proposals for modular arithmetic in cascaded Mach–Zehnder interferometers to compact silicon chips produced by design automation, and now serve as building blocks in optical neural network (ONN) accelerators, scientific computing systems, and AI-centric hardware.
1. Physical and Mathematical Foundations
OCDCs perform, in the optical domain, the fundamental level-1 BLAS operation:
$$\mathbf{a}\cdot\mathbf{b} = \sum_{i=1}^{N} a_i b_i$$
This computation is enabled by encoding vector elements ($a_i$, $b_i$) into optical field amplitude or phase, then effecting their product and summation through interference and detection. In classical architectures, this is realized by a cascade of Mach–Zehnder interferometers (MZIs), where “memory” phases in each stage encode stored vector elements, and “control” phases select which elements participate via selective phase routing. The output optical field emerges with a phase proportional to the dot product, modulo $2\pi$:
$$\phi_{\text{out}} \propto \sum_{i} s_i m_i \pmod{2\pi}$$
where $m_i$ is the element stored in the memory phase of stage $i$ and $s_i$ is the corresponding selector element. Generalizations implement matrix–vector and matrix–matrix products by parallelizing the architecture across memory phase arrays and shared selectors (Pavlichin et al., 2014).
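The phase-accumulation picture can be illustrated with a minimal numerical sketch. It assumes an idealized, lossless cascade in which each stage applies its memory phase only when a binary selector bit is set; the function name `cascade_output_phase` and the binary selectors are illustrative assumptions, not the scattering-matrix model of the cited work.

```python
import cmath
import math

def cascade_output_phase(memory_phases, selectors):
    """Output phase in [0, 2*pi) of an idealized, lossless stage cascade.

    Assumption: a stage multiplies the field by exp(i*phi) when its
    selector bit is 1 and leaves the field unchanged when it is 0.
    """
    field = 1.0 + 0.0j
    for phi, s in zip(memory_phases, selectors):
        if s:
            field *= cmath.exp(1j * phi)
    # cmath.phase returns a value in (-pi, pi]; fold it into [0, 2*pi).
    return cmath.phase(field) % (2 * math.pi)

memory = [0.3, 1.1, 2.0, 0.7]   # stored phases in radians (illustrative)
select = [1, 0, 1, 1]           # which elements participate
phase = cascade_output_phase(memory, select)

# Reference value: the selected dot product, reduced modulo 2*pi.
expected = sum(p * s for p, s in zip(memory, select)) % (2 * math.pi)
```

Because only the phase of the field is read out, any dot product larger than $2\pi$ wraps around, which is why the result is defined only modulo $2\pi$.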
Advanced OCDCs employ direct amplitude–amplitude multiplication, using input field magnitudes and sign-encoded phases in engineered nanophotonic cavities. For two inputs $a$ and $b$, multiplexed as optical sources, the cavity is designed (via the identity $ab = \tfrac{1}{4}\left[(a+b)^2 - (a-b)^2\right]$) to yield a differential photocurrent exactly proportional to $ab$ (Mathur, 18 Jul 2025). Scaling this to $N$-dimensional dot products is achieved by time, wavelength, or spatial multiplexing of cavity units.
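A short sketch of the differential-detection idea, assuming real-valued field amplitudes (with sign carried by a 0 or π phase) and ideal square-law photodetectors. The helper names `balanced_product` and `multiplexed_dot` are hypothetical, and the sequential loop stands in for time multiplexing of a single cavity unit.

```python
def balanced_product(a, b):
    """Differential photocurrent (arbitrary units) for real amplitudes a, b.

    Assumption: one detector sees the sum field and the other the
    difference field, and each responds to the squared amplitude.
    """
    i_plus = (a + b) ** 2     # intensity at the sum port
    i_minus = (a - b) ** 2    # intensity at the difference port
    return (i_plus - i_minus) / 4.0   # algebraically equal to a * b

def multiplexed_dot(xs, ys):
    """Accumulate per-element products, as in time multiplexing."""
    return sum(balanced_product(x, y) for x, y in zip(xs, ys))

x = [0.5, -1.0, 2.0]
y = [4.0, 0.25, -0.5]
result = multiplexed_dot(x, y)
```

The subtraction removes the $a^2$ and $b^2$ terms that a single detector would record, leaving only the cross term that carries the product and its sign.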
2. Device Architectures and Implementation Strategies
2.1 Cascaded Interferometer Networks
The initial OCDC instantiation is a serial array of MZIs, each incorporating a memory phase and a control phase (Pavlichin et al., 2014). The architecture is governed by a lower-triangular mapping between selector vectors, physical control phases, and a “tail” phase ensuring port-unicity, all described by the overall circuit scattering matrix. Selector control is realized using low-power, nonvolatile phase shifts (thermo-optic, electro-optic, optomechanical), while readout is achieved via phase-sensitive interferometric detection.
2.2 Inverse-Designed Nanophotonic Cores
Modern OCDCs adopt inverse-design—gradient-based topology optimization of the dielectric profile—to realize ultracompact (<15 μm²) cores that merge mode multiplexers and coherent mixers into single silicon/air patterned regions (Zhu et al., 2024). Inputs are launched into distinct spatial or modal channels, processed coherently, and detected with balanced photodiodes measuring quadratures proportional to the real part of the vector dot product. Spectral, spatial, and modal degrees of freedom can all be harnessed, with crosstalk and insertion loss kept low by design.
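The balanced-detection readout can be sketched numerically. The example assumes an ideal 50/50 mixer per channel and ideal photodiodes; it does not model the inverse-designed structure itself, and `balanced_quadrature` and `coherent_dot_real` are hypothetical names.

```python
import math

def balanced_quadrature(a, b):
    """Balanced-photodiode signal after mixing complex fields a and b.

    Assumption: an ideal 50/50 mixer produces (a + b)/sqrt(2) and
    (a - b)/sqrt(2); the detectors subtract the two intensities.
    """
    out_plus = (a + b) / math.sqrt(2)
    out_minus = (a - b) / math.sqrt(2)
    return abs(out_plus) ** 2 - abs(out_minus) ** 2   # = 2 * Re(a * conj(b))

def coherent_dot_real(a_vec, b_vec):
    """Real part of sum_i a_i * conj(b_i), summed over channels."""
    return 0.5 * sum(balanced_quadrature(a, b) for a, b in zip(a_vec, b_vec))

a_vec = [1 + 1j, 0.5j]
b_vec = [2 - 1j, 1 + 0j]
measured = coherent_dot_real(a_vec, b_vec)
reference = sum(a * b.conjugate() for a, b in zip(a_vec, b_vec)).real
```

Only the in-phase quadrature is recovered this way; obtaining the imaginary part would require a second measurement with one input shifted by a quarter wave.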
2.3 Coherent Crossbar Arrays with Programmable Weights
For AI acceleration, OCDC arrays are extended to 2D programmable crossbars on integrated silicon photonic/CMOS platforms (Sturm et al., 2022). Here, input vectors modulate row waveguide amplitudes via optical DACs; programmable weights are implemented by nonvolatile phase change material (PCM) attenuation cells, and the columnar coherent sum is detected to yield weighted sums. The architecture is tightly integrated with electronic memory (SRAM/DRAM), digital backend, and thermal compensation. Dual-core arrangements eliminate write-latency bottlenecks for large batch sizes.
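As a rough illustration of the crossbar dataflow, the sketch below treats each PCM cell as an amplitude transmission between 0 and 1 and assumes that all contributions in a column add in phase. Optical DAC quantization, loss, and phase errors are ignored, and `crossbar_column_sums` is a hypothetical name.

```python
def crossbar_column_sums(inputs, transmissions):
    """Coherent column sums for a rows x cols grid of attenuation cells.

    inputs:         one field amplitude per row waveguide
    transmissions:  transmissions[r][c] is the amplitude transmission
                    (0..1) of the cell at row r, column c (assumption:
                    attenuation only, no phase shift)
    """
    rows = len(inputs)
    if len(transmissions) != rows:
        raise ValueError("need one row of cells per input")
    cols = len(transmissions[0])
    for row in transmissions:
        if len(row) != cols:
            raise ValueError("all rows must have the same number of cells")
        if any(t < 0.0 or t > 1.0 for t in row):
            raise ValueError("transmission must lie in [0, 1]")
    # Each column output is the in-phase sum of the attenuated row fields.
    return [sum(inputs[r] * transmissions[r][c] for r in range(rows))
            for c in range(cols)]

x = [1.0, 0.5]
w = [[0.5, 1.0],
     [1.0, 0.0]]
y = crossbar_column_sums(x, w)
```

Since attenuation cells can only scale amplitudes between 0 and 1, signed weights would need an additional encoding such as differential column pairs, which this sketch leaves out.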
2.4 Regression-Capable, Numerically Complete ONN Accelerators
Silicon-based OCDCs have been demonstrated for high-precision, real-valued dot-product computation in deep-learning regression tasks (Xu et al., 2021). Push–pull MZM amplitude modulator cascades accept the elements of the two operand vectors as drive voltages, with all products coherently summed via inverse-couplers and read out by interference with a strong local oscillator. In-situ backpropagation uses measured outputs to iteratively tune phase shifter voltages, calibrating away device imperfections for reliable full-precision neural layers.
3. Performance Metrics and Device Characteristics
OCDC performance is assessed across throughput, energy, precision, and footprint:
- Throughput Density: Mode-multiplexed dot-product cores are characterized in TOPS/mm², with throughput limited by modulator/detector electronics (Zhu et al., 2024); programmable crossbars are characterized in MAC/s/mm² (Sturm et al., 2022).
- Energy per MAC: Thermo-optic devices operate down to the pJ/MAC level; electro-optic upgrades and coherent architectures can lower this further, surpassing electronic accelerators (Xu et al., 2021, Sturm et al., 2022).
- Precision: Regression-capable OCDCs achieve low normalized mean-square errors in photonic FC and conv layers, approaching 32-bit digital performance in tasks such as AUTOMAP image reconstruction (Xu et al., 2021).
- Area Efficiency: Inverse-designed cavity engines yield up to 88% reduction in photonic core area for transformer-scale workloads (Mathur, 18 Jul 2025).
- Crosstalk and Bandwidth: Crosstalk in optimized mode-multiplexed cores is about −9 dB (12%), with operating bandwidths of about 40 nm (1540–1580 nm) (Zhu et al., 2024).
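The crosstalk and bandwidth figures can be cross-checked with a few lines of arithmetic. The sketch assumes the 12% crosstalk figure is a power ratio, so the decibel conversion uses a factor of 10; the helper names are illustrative.

```python
import math

def fraction_to_db(power_fraction):
    """Convert a power ratio to decibels."""
    return 10.0 * math.log10(power_fraction)

def db_to_fraction(db):
    """Convert decibels back to a power ratio."""
    return 10.0 ** (db / 10.0)

crosstalk_db = fraction_to_db(0.12)      # about -9.2 dB
crosstalk_fraction = db_to_fraction(-9)  # about 0.126, i.e. roughly 12%
bandwidth_nm = 1580 - 1540               # width of the quoted wavelength span
```

A 12% power leak therefore corresponds to roughly −9 dB, and the quoted 1540–1580 nm span is 40 nm wide.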
4. System Integration and Scalability
Monolithic integration of OCDC cores with electronic subsystems (CMOS logic, PCM memory, high-speed ADCs/DACs) is now established at the 45 nm node, facilitating tight photonic–electronic co-packaging (Sturm et al., 2022). Scaling to large M×N dot-product arrays utilizes spatial tiling, waveguide multiplexing, and dual-core “ping-pong” programming to sustain high throughput under realistic DRAM/SRAM traffic and weight update cycles. Isolation trenches, robust topology optimization, and thermal compensation schemes are employed to minimize cavity crosstalk and enable wafer-scale arrays (Mathur, 18 Jul 2025).
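The benefit of dual-core "ping-pong" programming can be sketched with a toy timing model. It assumes a fixed weight-write time and compute time per tile, a single shared write channel, and a single shared optical input path; none of these parameters come from the cited work, and `single_core_time` and `pingpong_time` are hypothetical names.

```python
def single_core_time(n_tiles, t_write, t_compute):
    """One core: every tile is written, then computed, strictly in sequence."""
    return n_tiles * (t_write + t_compute)

def pingpong_time(n_tiles, t_write, t_compute):
    """Two cores alternate: one is reprogrammed while the other computes.

    Assumptions: writes are serialized on one shared channel, and only
    one core can receive optical inputs at a time.
    """
    write_free = 0.0          # when the write channel is next available
    compute_free = 0.0        # when the optical input path is next available
    core_free = [0.0, 0.0]    # when each core finished its previous tile
    for k in range(n_tiles):
        core = k % 2
        write_end = max(write_free, core_free[core]) + t_write
        write_free = write_end
        compute_end = max(write_end, compute_free) + t_compute
        compute_free = compute_end
        core_free[core] = compute_end
    return compute_free

serial = single_core_time(3, 2.0, 1.0)
overlapped = pingpong_time(3, 2.0, 1.0)
```

In this model the steady-state cost per tile drops from the sum of write and compute time to the larger of the two, so write latency is hidden whenever computing a tile takes at least as long as writing one.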
5. Algorithms, Self-Calibration, and Compensation
Photonic nonlinearities, phase drift, and fabrication variability are actively corrected by in-situ backpropagation (gradient descent on phase shifter settings), monitor taps, and digital feedback (Xu et al., 2021). Loss and phase errors in large-scale crossbars are balanced via careful coupler design, thermal shifter trim, and reference-channel calibration. Weighted dot-products can be adapted by converting phase selection to arbitrary weights using feedback loops in OCDC interferometric stages (Pavlichin et al., 2014).
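The calibration loop can be illustrated with a toy example. The sketch substitutes finite-difference gradient descent for in-situ backpropagation, and models the device as a single interferometric element whose output is the cosine of the applied phase plus an unknown offset; `make_device` and `calibrate` are hypothetical names, and the learning rate and step count are arbitrary.

```python
import math

def make_device(phase_error):
    """Toy device: output field depends on the setting plus a hidden offset."""
    def measure(theta):
        return math.cos(theta + phase_error)
    return measure

def calibrate(measure, target, theta=0.5, lr=0.5, steps=200, delta=1e-4):
    """Tune theta so the measured output matches target.

    Uses only measured outputs: the gradient of the squared error is
    estimated by a central finite difference around the current setting.
    """
    for _ in range(steps):
        loss_plus = (measure(theta + delta) - target) ** 2
        loss_minus = (measure(theta - delta) - target) ** 2
        grad = (loss_plus - loss_minus) / (2.0 * delta)
        theta -= lr * grad
    return theta

device = make_device(phase_error=0.2)   # offset unknown to the optimizer
setting = calibrate(device, target=0.3)
residual = abs(device(setting) - 0.3)
```

The optimizer never sees the offset directly; it compensates for it because the error is evaluated on the measured output, which is the same principle that lets in-situ training absorb fabrication variability.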
6. Applications in AI, Scientific Computing, and Communications
OCDCs are foundational for hardware acceleration in machine learning, data-intensive scientific computing, and high-speed analog vector processing:
- AI/ML Inference: Implement convolution, FC, transformer, and coding kernels at throughput and power densities exceeding cutting-edge GPUs (A100 benchmark: 15× power and 7× area improvements) (Sturm et al., 2022).
- Regression Tasks: Capable of deploying regression networks (AUTOMAP) for tomographic, optical flow, and domain-inverse image-reconstruction pipelines, with photonic error rates close to digital (Xu et al., 2021, Zhu et al., 2024).
- Optical Memory/Storage: Controlled random-access read and syndrome extraction for LDPC and error-correcting codes (Pavlichin et al., 2014).
- Neural Acceleration: Dense programmable dot-product tiles for ONNs, plug-in replacement of conventional MZI-based inner-product engines (Mathur, 18 Jul 2025).
- Scientific and Medical Imaging: Real-time tomography, phase-sensitive data processing, and spectrally multiplexed computing (Xu et al., 2021).
7. Outlook and Future Directions
Key research trajectories for OCDCs include scaling to wider modes and spectral bands for matrix–tensor algebra (Zhu et al., 2024), fully integrated photonic–electronic systems with on-chip memory stacks (Sturm et al., 2022), nanophotonic design automation for robust mass manufacture (Mathur, 18 Jul 2025), and quantum-inspired enhancements exploiting coherent state manipulation beyond classical architectures (Pavlichin et al., 2014). Improved low-loss modulators, deep spatial multiplexing, and advanced calibration promise continued increases in speed, energy efficiency, and computational complexity addressable by OCDC-based hardware in AI-centric and scientific domains.