Identify training methods and hardware for large-scale in situ photonic neural network training

Determine the combination of training algorithms and optical neural network hardware—such as integrated photonic circuits based on Mach–Zehnder interferometer meshes with suitable nonlinear activation functions—that enables large-scale training primarily in the photonic domain with minimal reliance on digital-electronic computation. Establish scalable architectures and procedures that can compute gradients and update weights in situ, and demonstrate that such approaches match or surpass state-of-the-art digital backpropagation in accuracy, speed, and energy efficiency for practical tasks.
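
To make "in situ" concrete, the sketch below simulates one well-known strategy: estimating gradients by dithering each MZI phase shifter and remeasuring the output power, so that every quantity used in the weight update comes from forward passes through the (here, simulated) mesh. The Clements-style layout, the 2x2 MZI convention, the mesh size, and the finite-difference update rule are illustrative assumptions, not a prescription from the roadmap; on real hardware the matrix products would be replaced by physical light propagation and photodetector readout.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4  # number of waveguide modes; illustrative size

def mzi(theta, phi):
    # 2x2 transfer matrix of one Mach-Zehnder interferometer in a common
    # (Clements-style) convention: the internal phase theta sets the
    # splitting ratio, phi is the external phase shifter.
    return 1j * np.exp(1j * theta / 2) * np.array([
        [np.exp(1j * phi) * np.sin(theta / 2), np.cos(theta / 2)],
        [np.exp(1j * phi) * np.cos(theta / 2), -np.sin(theta / 2)],
    ])

def mesh_unitary(phases):
    # Rectangular mesh: each layer applies MZIs on alternating mode pairs.
    U = np.eye(N, dtype=complex)
    k = 0
    for layer in range(N):
        for i in range(layer % 2, N - 1, 2):
            M = np.eye(N, dtype=complex)
            M[i:i + 2, i:i + 2] = mzi(phases[k], phases[k + 1])
            k += 2
            U = M @ U
    return U

N_PHASES = 2 * sum(len(range(l % 2, N - 1, 2)) for l in range(N))

def measured_loss(phases, x, target_power):
    # Stand-in for a physical forward pass: inject field x, read output
    # powers at photodetectors, compare to the target power pattern.
    y_power = np.abs(mesh_unitary(phases) @ x) ** 2
    return np.mean((y_power - target_power) ** 2)

# Toy task: steer a uniform input field to a chosen output power pattern.
x = np.ones(N, dtype=complex) / np.sqrt(N)
target = np.array([0.7, 0.1, 0.1, 0.1])

phases = rng.uniform(0.0, 2.0 * np.pi, N_PHASES)
delta, lr = 1e-3, 0.3
for step in range(300):
    grad = np.zeros_like(phases)
    for j in range(N_PHASES):
        # Dither one phase shifter at a time and remeasure: two extra
        # forward passes per parameter, and no digital backpropagation.
        e = np.zeros(N_PHASES); e[j] = delta
        grad[j] = (measured_loss(phases + e, x, target)
                   - measured_loss(phases - e, x, target)) / (2 * delta)
    phases -= lr * grad

print("final loss:", measured_loss(phases, x, target))
```

The measurement cost of such a zeroth-order scheme grows with the number of phase shifters (two forward passes per parameter per step), which is one reason scalable in situ gradient methods remain an open problem.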

Background

Much recent work on optical neural networks (ONNs) has focused on leveraging photonics for fast, energy-efficient inference, with models typically trained on conventional digital computers and the resulting weights then ported to optical hardware. Hybrid and physics-aware training schemes partially reduce this reliance on digital computation, but large-scale training carried out predominantly in photonic hardware has yet to be demonstrated.

The paper surveys backpropagation-based optical training, equilibrium propagation, Hamiltonian echo backpropagation, and forward–forward strategies, noting that each has limitations when scaled or applied to realistic optical hardware with noise and nonidealities. It emphasizes the need to co-design algorithms and hardware to minimize digital computation during training.
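
Among the surveyed algorithms, forward-forward is a useful illustration of hardware-friendly training because it requires only forward passes and layer-local updates, with no backward pass through the optical system. The NumPy sketch below follows Hinton's formulation (goodness defined as the sum of squared activations, with a logistic loss against a threshold); the layer sizes, toy data, and hyperparameters are assumptions chosen for illustration, and an optical realization would implement the matrix products and nonlinearity physically.

```python
import numpy as np

rng = np.random.default_rng(1)

def goodness(h):
    # Forward-forward "goodness" of a layer's activity: sum of squares.
    return np.sum(h ** 2, axis=-1)

class FFLayer:
    def __init__(self, d_in, d_out, lr=0.5, threshold=2.0):
        self.W = rng.normal(0.0, 1.0 / np.sqrt(d_in), (d_in, d_out))
        self.lr, self.threshold = lr, threshold

    def _normalize(self, x):
        # Pass only the direction of the input forward, so each layer
        # must judge goodness from its own weights, not inherited norms.
        return x / (np.linalg.norm(x, axis=-1, keepdims=True) + 1e-8)

    def forward(self, x):
        return np.maximum(self._normalize(x) @ self.W, 0.0)  # ReLU

    def local_update(self, x_pos, x_neg):
        # Push goodness above threshold on positive data and below it on
        # negative data; only quantities local to this layer are used.
        for x, positive in ((x_pos, True), (x_neg, False)):
            xn = self._normalize(x)
            h = np.maximum(xn @ self.W, 0.0)
            p = 1.0 / (1.0 + np.exp(-(goodness(h) - self.threshold)))
            coeff = (p - 1.0) if positive else p  # dLoss/dGoodness
            self.W -= self.lr * 2.0 * xn.T @ (coeff[:, None] * h) / len(x)
        return self.forward(x_pos), self.forward(x_neg)

# Toy data: positives share a common direction, negatives are noise.
d = 16
v = rng.normal(size=d); v /= np.linalg.norm(v)
x_pos = 0.5 * rng.normal(size=(256, d)) + 2.0 * v
x_neg = rng.normal(size=(256, d))

layers = [FFLayer(d, 32), FFLayer(32, 32)]
for epoch in range(100):
    hp, hn = x_pos, x_neg
    for layer in layers:
        hp, hn = layer.local_update(hp, hn)

def embed(x):
    for layer in layers:
        x = layer.forward(x)
    return x

print("mean goodness  pos:", goodness(embed(x_pos)).mean())
print("mean goodness  neg:", goodness(embed(x_neg)).mean())
```

Because each layer's update depends only on its own inputs and outputs, the digital electronics needed during training shrink to bookkeeping around the physical forward passes, which is the kind of algorithm-hardware co-design the roadmap calls for.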

References

"It is an open question what combination of training algorithm and ONN hardware will ultimately enable training of ONNs at large scale with minimal usage of digital-electronic hardware during training; it may well be the case that neither the currently known training methods nor the known hardware architectures and designs are what we will use."

Roadmap on Neuromorphic Photonics (arXiv:2501.07917, Brunner et al., 14 Jan 2025), section "Training optical neural networks" (Concluding Remarks).