Optical Convolutional Processors

Updated 30 July 2025

Optical convolutional processors are photonic computing architectures that perform convolution using optical interference, optical Fourier transforms, and photonic matrix multiplication.
They combine devices such as Mach–Zehnder interferometers (MZIs) and metasurfaces with multiplexing techniques to achieve significant improvements in speed, energy efficiency, and latency over conventional electronics.
Key components, such as tunable phase shifters and integrated photonic meshes, enable scalable, highly parallel computations, though challenges like optical loss and alignment persist.

Optical convolutional processors are photonic computing architectures that directly implement convolution, an operation fundamental to deep learning and signal processing, by exploiting the parallelism, phase, and interference properties of light. These systems physically realize mathematical convolutions using a variety of on-chip, free-space, and metasurface-based optical techniques, with reported orders-of-magnitude improvements in throughput, energy efficiency, and latency over conventional electronic computation, especially for moderate problem sizes. The foundational principle is that passive or programmable photonic circuits can execute the fast Fourier transform (FFT) or direct space-domain matrix–vector and matrix–matrix multiplications while leveraging the massive bandwidth and parallelism inherent to the optical domain.

1. Foundational Principles and Photonic Architectures

Optical convolutional processors realize convolution through several approaches:

Optical Fast Fourier Transform (OFFT): Implementing the classical Cooley–Tukey FFT algorithm using cascaded Mach–Zehnder interferometers (MZIs), directional couplers, and phase shifters enables arithmetic operations necessary for the FFT’s butterfly network, where the optical phase encodes multiplication by twiddle factors and interference realizes addition/subtraction (George et al., 2017).
Diffractive and Metasurface Computing: Integrated diffractive optics, such as star couplers and nanophotonic metasurfaces, physically implement unitary transformations (notably the DFT), cross-correlation, and differentiation directly through tailored phase and amplitude modulation within a single passive element (Ong et al., 2020, Yu et al., 15 Mar 2025).
Matrix–Vector and Matrix–Matrix Multiplication: Photonic mesh architectures (composed of programmable MZIs), TDM-based intensity processors, and spatial–wavelength–temporal hyper-multiplexed ONNs encode operand vectors and weights in amplitude and/or wavelength, accumulating results in photodetectors for massively parallel MAC computations (Chai et al., 30 Jan 2025, Luan et al., 31 Mar 2025).
Wavelength and Mode Multiplexing: Parallelism is further enhanced by employing wavelength-division multiplexing (WDM), mode-division multiplexing (MDM), or programmable EO frequency combs, with each wavelength or mode encoding a dimension or kernel weight of the convolution (Tan et al., 2021, He et al., 23 Jun 2025).

For example, in the OFFT, the two principal operations—complex addition/subtraction and multiplication by a twiddle factor $E_{xy} = \exp(-i 2\pi xy/N)$ —are implemented using passive directional couplers and phase shifters, respectively, with delay lines to match the required data-arrival times in the FFT network (George et al., 2017). In the metasurface approach, double-phase encoding and polarization multiplexing realize full complex–amplitude operations without the need for bulky optics or sequential digital postprocessing (Yu et al., 15 Mar 2025).
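The butterfly structure described above can be sketched numerically. The following is a minimal illustration rather than a model of any published device: the function names (`twiddle`, `butterfly`, `offt`) are hypothetical, the coupler is idealized as a lossless, unnormalized sum/difference, and delay lines are not modeled.

```python
import cmath


def twiddle(x, y, n):
    """Phase-shifter factor E_xy = exp(-i 2 pi x y / N)."""
    return cmath.exp(-2j * cmath.pi * x * y / n)


def butterfly(a, b, w):
    """One butterfly: phase-shift the lower arm, then add and subtract.

    Assumption: the coupler is idealized as an exact sum/difference with
    no 1/sqrt(2) normalization and no loss.
    """
    b_shifted = w * b                      # role of the phase shifter
    return a + b_shifted, a - b_shifted    # role of the coupler outputs


def offt(signal):
    """Radix-2 decimation-in-time FFT; length must be a power of two."""
    n = len(signal)
    if n == 1:
        return list(signal)
    even = offt(signal[0::2])
    odd = offt(signal[1::2])
    out = [0j] * n
    for k in range(n // 2):
        out[k], out[k + n // 2] = butterfly(even[k], odd[k], twiddle(1, k, n))
    return out


def dft(signal):
    """Direct O(N^2) DFT, used only as a reference for comparison."""
    n = len(signal)
    return [sum(signal[m] * twiddle(m, k, n) for m in range(n))
            for k in range(n)]


if __name__ == "__main__":
    data = [1, 2, 3, 4, 0, 0, 0, 0]
    error = max(abs(a - b) for a, b in zip(offt(data), dft(data)))
    print(f"max deviation from direct DFT: {error:.2e}")
```

Each call to `butterfly` corresponds to one coupler plus one phase shifter in the network; the recursion depth of `offt` corresponds to the number of cascaded stages.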

2. Device-Level Implementation: Components and Integration

Several hardware primitives recur across optical convolutional processor designs:

Mach–Zehnder Interferometers (MZIs): The core element for programmable unitary transformations in the FFT, matrix multiplication networks, and phase-sensitive operations due to their capacity for high-fidelity phase and amplitude modulation (George et al., 2017, Mojaver et al., 2023).
Directional Couplers and Multimode Interference (MMI) Couplers: Utilized for waveguide splitting/combining, efficient routing, and implementation of star couplers for integrated DFTs (Ong et al., 2020, Dong et al., 30 Jan 2025).
Phase Shifters and Heaters: Integration of thermo-optic, electro-optic, or other electrically tunable phase shifters allows fine tuning of the phase relationships critical for interference-based logic and compensates for environmental drift (Mojaver et al., 2022).
Programmable Frequency Combs and Microcombs: Tunable EO frequency combs or Kerr microcombs serve as scalable sources with hundreds of wavelengths, enabling simultaneous parallel MACs via wavelength encoding (Tan et al., 2021, He et al., 23 Jun 2025).
Optically Programmable Phase-Change Memory (OPCM): Allows in-memory optical computation by modulating transmission through GST-based cells, supporting both memory access and parallel multiply–accumulate within the storage medium (Sunny et al., 11 Jul 2024).
Metasurface and Nanoantenna Arrays: Subwavelength TiO₂ nanopillar arrays enable spatially varying phase and amplitude control for edge detection, correlation, and 3D holography in a single passive optical element (Yu et al., 15 Mar 2025).
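To make the role of the MZI in the list above concrete, the sketch below builds the 2×2 transfer matrix of an idealized MZI from two 50:50 couplers and two phase shifters and checks that it is unitary. The decomposition used here (coupler, internal phase `theta`, coupler, external phase `phi`) is one common textbook convention and is an assumption of this example, not a description of any cited device; the function names are hypothetical.

```python
import cmath
import math


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def beam_splitter():
    """Ideal lossless 50:50 coupler."""
    s = 1 / math.sqrt(2)
    return [[s, 1j * s], [1j * s, s]]


def phase(angle):
    """Phase shifter acting on the upper arm only."""
    return [[cmath.exp(1j * angle), 0], [0, 1]]


def mzi(theta, phi):
    """Transfer matrix BS * P(theta) * BS * P(phi) of an idealized MZI."""
    m = phase(phi)
    for stage in (beam_splitter(), phase(theta), beam_splitter()):
        m = matmul2(stage, m)
    return m


def is_unitary(u, tol=1e-12):
    """Check U^dagger U = I entry by entry."""
    for i in range(2):
        for j in range(2):
            entry = sum(u[k][i].conjugate() * u[k][j] for k in range(2))
            if abs(entry - (1 if i == j else 0)) > tol:
                return False
    return True


if __name__ == "__main__":
    u = mzi(0.7, 1.3)
    print("unitary:", is_unitary(u))
    print("bar-port power:", abs(u[0][0]) ** 2)
```

In this convention the power remaining in the input waveguide is sin²(θ/2), so the internal phase sets the splitting ratio while the external phase sets the relative output phase. Meshes of such cells are what allow programmable unitary transformations.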

A typical workflow involves high-speed DACs or optical modulators launching data into the photonic circuit, where the signals interact within the passive interferometric or dispersive network. The output intensities are detected, optionally normalized or differenced to recover positive and negative values, and digitized for postprocessing or for further neural network layers (Liu et al., 28 Jul 2025). Programmability is achieved via on-chip electronic or thermal tuning.
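The differencing step mentioned above can be illustrated with a short sketch. Optical intensities and transmissions are non-negative, so one common way to represent a signed kernel is to split it into a positive and a negative part, detect each separately, and subtract the two photocurrents. The helper names below are hypothetical, and the detector is idealized as a noiseless sum of transmitted powers.

```python
def split_signed(weights):
    """Split signed weights into two non-negative transmission sets."""
    positive = [max(w, 0.0) for w in weights]
    negative = [max(-w, 0.0) for w in weights]
    return positive, negative


def photodetect(intensities, transmissions):
    """Idealized photocurrent: sum of transmitted optical powers."""
    return sum(i * t for i, t in zip(intensities, transmissions))


def balanced_dot(intensities, weights):
    """Signed dot product recovered as a difference of two photocurrents."""
    positive, negative = split_signed(weights)
    return photodetect(intensities, positive) - photodetect(intensities, negative)


if __name__ == "__main__":
    pixels = [0.2, 0.5, 1.0]        # non-negative input intensities
    kernel = [0.5, -1.0, 0.25]      # signed kernel weights
    print(balanced_dot(pixels, kernel))
```

The subtraction can be done in the analog domain with a balanced photodetector pair or digitally after conversion; this sketch does not distinguish between the two.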

3. Mathematical Operation and Algorithmic Mapping

Optical convolutional processors map core mathematical convolutions onto hardware as follows:

Fourier-Domain Convolution: For a 1D or 2D signal $x[n]$ convolved with $h[n]$,

$y[n] = \mathcal{F}^{-1} [\mathcal{F}[x[n]] \cdot \mathcal{F}[h[n]]]$

where forward and inverse FFTs are mapped onto cascaded MZI networks or performed in free space via lensing and diffraction (George et al., 2017, Ahmed et al., 2020, Cottle et al., 2020).
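As a numerical check of the Fourier-domain identity above, the sketch below multiplies the DFTs of two sequences elementwise and compares the inverse transform with a directly computed convolution. Note that for sampled, finite-length signals this identity yields circular convolution; a linear convolution is obtained only if the inputs are zero-padded so the result does not wrap around. The function names are illustrative, and a direct $O(N^2)$ DFT stands in for the optical transform.

```python
import cmath


def dft(x, inverse=False):
    """Direct DFT (or inverse DFT with 1/N scaling)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n)
               for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def fourier_convolve(x, h):
    """y = F^-1[ F[x] . F[h] ] for two equal-length real sequences."""
    spectrum = [a * b for a, b in zip(dft(x), dft(h))]
    return [v.real for v in dft(spectrum, inverse=True)]


def circular_convolve(x, h):
    """Reference circular convolution computed directly in the space domain."""
    n = len(x)
    return [sum(x[m] * h[(k - m) % n] for m in range(n)) for k in range(n)]


if __name__ == "__main__":
    # Zero-padded so that circular and linear convolution coincide.
    x = [1, 2, 3, 0, 0, 0]
    h = [1, 1, 0, 0, 0, 0]
    print([round(v, 6) for v in fourier_convolve(x, h)])
    print(circular_convolve(x, h))
```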

Direct Space-Domain Multiplication: Matrix–vector and matrix–matrix multiplication are executed by encoding input vectors sequentially (e.g., via TDM (Chai et al., 30 Jan 2025)) or in parallel (e.g., via spatial and spectral channels (Luan et al., 31 Mar 2025)), applying programmed weights via modulators, and summing photodetected currents:

$\mathbf{y} = \mathbf{W} \mathbf{x}$
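A minimal sketch of the time-multiplexed variant follows, assuming one time slot per input element and one integrating photodetector per output row. The scheduling, the function name `tdm_matvec`, and the restriction to non-negative intensity values are assumptions of this illustration, not details taken from the cited works.

```python
def tdm_matvec(weights, x):
    """Compute y = W x by accumulating photocurrent over time slots.

    Assumptions: time slot t carries input element x[t]; in that slot the
    modulator feeding output row r is set to weights[r][t]; each detector
    integrates over all slots before being read out.
    """
    n_out = len(weights)
    accumulated = [0.0] * n_out
    for t, x_t in enumerate(x):                  # one input element per slot
        for row in range(n_out):                 # one modulator per output row
            accumulated[row] += weights[row][t] * x_t
    return accumulated


if __name__ == "__main__":
    W = [[1.0, 2.0],
         [3.0, 4.0]]
    x = [0.5, 0.25]
    print(tdm_matvec(W, x))   # [1.0, 2.5]
```

In this scheme the number of modulators grows with the number of output rows rather than with the number of matrix entries, at the cost of one time slot per input element.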

Structural Re-parameterization (SRP): For diffractive optical units, SRP is used to train the physical parameters (e.g., slot widths, phase shifts in metalines) such that the analog transfer function of the photonic device matches any real-valued digital kernel to high accuracy, minimizing mean-squared error between the optical simulation and target convolution (Huang et al., 2022).
Cross-Correlation and Higher-Order Operators: Metasurface platforms directly implement functions such as first-order differentiation, vertex detection, and Laplacian sharpening by encoding target transfer functions (e.g., $\mathcal{F}[\partial f/\partial x] = (2\pi i k_x)\,\hat{f}(k_x,k_y)$) into physical phase/amplitude control (Yu et al., 15 Mar 2025).
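The differentiation transfer function in the last item can be checked numerically in one dimension: multiplying each Fourier component by $2\pi i k$ and transforming back yields the derivative of a periodic signal. The sketch below is a digital stand-in for what a passive element would do optically; the names are hypothetical, the signal is assumed periodic over a unit interval, and a direct DFT is used for simplicity.

```python
import cmath
import math


def dft(x, inverse=False):
    """Direct DFT (or inverse DFT with 1/N scaling)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[m] * cmath.exp(sign * 2j * cmath.pi * m * k / n)
               for m in range(n))
           for k in range(n)]
    return [v / n for v in out] if inverse else out


def spectral_derivative(samples, length=1.0):
    """Differentiate a periodic signal by applying H(k) = 2 pi i k."""
    n = len(samples)
    shaped = []
    for k, value in enumerate(dft(samples)):
        freq = k if k < n / 2 else k - n     # signed frequency index
        if 2 * k == n:
            freq = 0                         # drop the ambiguous Nyquist bin
        shaped.append(2j * math.pi * (freq / length) * value)
    return [v.real for v in dft(shaped, inverse=True)]


if __name__ == "__main__":
    n = 16
    signal = [math.sin(2 * math.pi * i / n) for i in range(n)]
    exact = [2 * math.pi * math.cos(2 * math.pi * i / n) for i in range(n)]
    approx = spectral_derivative(signal)
    print(max(abs(a - b) for a, b in zip(approx, exact)))
```

Because the transfer function grows linearly with spatial frequency, it suppresses uniform regions and emphasizes rapid changes, which is why it serves as an edge-detection operator.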

Systems that employ nonlinear optical phenomena, such as four-wave mixing in joint transform correlators (JTCs), execute the convolution in a single optical step using intensity-dependent mixing at the Fourier plane, reducing algorithmic complexity to $O(n^2)$ from the $O(n^4)$ of classical 2D digital convolution (George et al., 2022).

4. Performance Metrics, Scalability, and Trade-offs

Multiple works rigorously quantify the performance of optical convolutional processors:

Latency and Speed: The latency of the all-optical FFT and of direct convolution is set only by the physical propagation delay of light, which reduces it to tens to hundreds of picoseconds (Ahmed et al., 2020, George et al., 2017). Image convolution at 11 TeraFLOPS, and at more than 10 TOPS, has been demonstrated experimentally in microcomb-based systems (Xu et al., 2020, Tan et al., 2021).
Energy Efficiency: Passive (interference-based) arithmetic minimizes power consumption, with energy per MAC operation in the attojoule regime and a compound figure of merit (convolutions·s⁻¹·W⁻¹·m⁻²) showing a $10^2$–$10^4\times$ advantage over GPUs for small to moderate problem sizes (George et al., 2017, Sunny et al., 11 Jul 2024, Luan et al., 31 Mar 2025).
Scaling Behavior: For OFFT, area scales as $O(N\log_2 N)$ with moderate growth in optical loss; for mesh MVMs, intensity-based architectures reduce needed modulators to $O(N)$ via TDM, supporting large-scale scalar multiplication (Chai et al., 30 Jan 2025). Metasurface platforms achieve subwavelength resolution and handle up to $4000 \times 4000$ meta-atoms (Yu et al., 15 Mar 2025).
Trade-offs: Physical chip area and waveguide losses limit the scaling of certain architectures (e.g., delay-line-based OFFT) (George et al., 2017). In TDM and spatial–wavelength–multiplexed networks, high throughput implies larger photodetector arrays and increased complexity in optical alignment or signal parsing (Luan et al., 31 Mar 2025). Nonlinear JTCs demand careful engineering of the nonlinear material response and suffer from constraints on achievable gain and phase matching (George et al., 2022).
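The scaling behavior listed above can be made concrete with a rough resource count for a radix-2 OFFT. The sketch assumes one butterfly per pair of inputs per stage and a fixed insertion loss per stage; the default of 0.5 dB per stage is a purely hypothetical placeholder, not a measured value from any cited work, and delay-line loss is ignored.

```python
def offt_resources(n, loss_per_stage_db=0.5):
    """Rough component count and path loss for an N-point radix-2 OFFT.

    Assumptions: N is a power of two; each of the log2(N) stages holds N/2
    butterflies; loss_per_stage_db is a hypothetical per-stage insertion
    loss and excludes delay-line loss.
    """
    if n < 2 or n & (n - 1):
        raise ValueError("n must be a power of two and at least 2")
    stages = n.bit_length() - 1            # log2(N) for a power of two
    butterflies = (n // 2) * stages        # O(N log2 N) growth
    return {
        "stages": stages,
        "butterflies": butterflies,
        "path_loss_db": stages * loss_per_stage_db,
    }


if __name__ == "__main__":
    for size in (8, 64, 1024):
        print(size, offt_resources(size))
```

Under these assumptions the component count grows as $N\log_2 N$ while the loss along any single optical path grows only with the number of stages, $\log_2 N$, which is consistent with the "moderate growth in optical loss" noted above.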

Experimental results support high classification accuracy: monolithic TFLN-based convolutional processors reach 96% on MNIST and 86% on Fashion-MNIST, and compress fully connected layers by 4–4.5× while retaining accuracy (Liu et al., 28 Jul 2025). Integrated photonic processors implement 1×1 and 2×2 convolution for image classification and segmentation, recording a Dice score of 0.658 on lung CT and 91.75% digit classification accuracy (Dong et al., 30 Jan 2025).

5. Programming, Calibration, and Robustness

Programmability and robustness are addressed through architectural innovations:

Efficient Programming of Interferometric Meshes: The Bokun mesh topology enables diagonal paths for independent phase monitoring of every MZI, supporting rapid calibration and in situ error correction without the need for full-network reconfiguration (Mojaver et al., 2023). It is reported to achieve up to an 83% improvement in energy efficiency per programming cycle.
On-Chip Phase Monitoring: Dual-mode processors (e.g., multi-transverse-mode in silicon photonics) convert local phase shifts into easily monitored intensity changes in auxiliary modes, facilitating rapid and scalable calibration without external coherent detection (Mojaver et al., 2022).
Self-Correcting Photonic Networks: Robustness to component imperfection is demonstrated via retraining procedures for diffractive photonic CNNs, where performance degraded by phase/amplitude noise in star couplers or filters can be restored close to ideal through retraining on the parameterized hardware (Ong et al., 2020).
Metasurface and Passive Devices: Passive architectures exhibit low sensitivity to environmental drift, with only minimal reduction in fidelity for moderate fabrication variability (Yu et al., 15 Mar 2025).

6. Emerging Applications, Impact, and Future Directions

Optical convolutional processors are being actively developed for real-world applications:

Computer Vision and Pattern Recognition: Ultra-high-speed, energy-efficient CNN inference has been demonstrated for tasks such as handwritten digit and facial recognition at >10 TOPS with 88–96% accuracy (Xu et al., 2020, Liu et al., 28 Jul 2025).
Edge and 5G Networks: Integrated photonic FFT accelerators offer reduced latency and power consumption, well-aligned with the requirements of edge computing and baseband signal processing for 5G (Ahmed et al., 2020).
In-Sensor and Real-Time Learning: Programmable EO comb photonic processors enable kernel reconfiguration at >38 GHz, supporting adaptive computation for autonomous systems and in-sensor learning in robotics and drones, with scalability that is independent of the architecture footprint (He et al., 23 Jun 2025).
Neuromorphic and Spiking Systems: Free-space OSCNNs mimic the computational structure of the visual cortex, employing Gabor filter banks, optical synchronizers, and spiking nonlinearity for low-latency, low-power image classification and object detection (Ahmadi et al., 2023).
Analog Compute-in-Memory: OPIMA merges phase-change memory with in-situ optical MACs, demonstrating up to 2.98× higher throughput and 137× better energy efficiency than previous photonic PIM systems (Sunny et al., 11 Jul 2024).
Biomedical Imaging and 3D Holography: Passively encoded metasurface platforms support real-time edge detection, vertex/feature extraction, and subwavelength 3D meta-holography for high-fidelity clinical or display applications (Yu et al., 15 Mar 2025).

Anticipated research directions include improved phase and amplitude programmability, tighter integration with mature digital control via FPGA and embedded systems, deeper scaling of kernel/matrix size enabled by advanced multiplexing, and further energy and speed optimizations via on-chip integration of actively controlled comb sources and low-loss delay architectures. Enhanced meta-atom fabrication and robust self-calibration techniques are expected to further improve real-world deployability.

7. Comparative Summary of Representative Architectures

Approach/Platform	Key Features	Representative Papers
On-chip silicon photonic OFFT	MZI/directional coupler FFT; $O(N\log N)$ area; $O(1)$ delay	(George et al., 2017, Ahmed et al., 2020)
Integrated star coupler DFT	Single-region diffractive DFT; reduced footprint	(Ong et al., 2020)
Programmable EO comb on TFLN	Monolithic multi-wavelength/weight; >1 TOPS; 38 GHz update	(He et al., 23 Jun 2025)
Kerr soliton microcomb vector accelerator	Temporal/wavelength/spatial interleaving; >10 TOPS	(Tan et al., 2021)
Free-space lens/SiPh hybrid OFT	$O(1)$ optical transform; >97% accuracy on MNIST	(Cottle et al., 2020)
Compute-in-memory OPCM (OPIMA)	Optical MACs in PCM main memory; 2.98×, 137× improvements	(Sunny et al., 11 Jul 2024)
Metasurface (TiO₂) analog processing	Edge/corner/correlation/holography; subwavelength control	(Yu et al., 15 Mar 2025)
Spiking CNN via free-space optics	Gabor filters, 4f correlators, synchronizer	(Ahmadi et al., 2023)
Intensity-based TDM photonic MVM	Single-λ, scalable, 93.47% MNIST; $O(N)$ modulator count	(Chai et al., 30 Jan 2025)
Multi-block integrated processor	Parallel, 1×1/2×2 convolution, hybrid U-Net/classifier	(Dong et al., 30 Jan 2025)
Hyper-multiplexed spatial/λ/time ONN	Single-shot MMMs; 20 aJ/MAC; 292k weights; 96.4% accuracy	(Luan et al., 31 Mar 2025)
Monolithic TFLN convolutional processor	4×4 kernels, balanced detection, 96% MNIST, FPGA compatible	(Liu et al., 28 Jul 2025)

References to Key Research

Silicon photonic FFT: (George et al., 2017, Ahmed et al., 2020)
EO comb processors: (He et al., 23 Jun 2025)
Kerr microcomb accelerators: (Tan et al., 2021, Xu et al., 2020)
Metasurface analog computing: (Yu et al., 15 Mar 2025)
Compute-in-memory (OPIMA): (Sunny et al., 11 Jul 2024)
Spiking optical CNN: (Ahmadi et al., 2023)
Structural re-parameterization and oCNN: (Huang et al., 2022)

Optical convolutional processors constitute a rapidly evolving field that combines advances in integrated photonics, metasurface engineering, and computational optics. Their demonstrated capabilities, especially in energy–throughput scaling for convolutional layers, position them as promising candidates for next-generation AI hardware in data centers, edge devices, and integrated autonomous systems.