- The paper presents a novel in situ backpropagation method that combines the adjoint variable method with time-reversal interference to compute gradients directly on photonic hardware.
- The methodology enables efficient training of photonic neural networks through intensity measurements alone, eliminating the need for external simulation models.
- Numerical results show that the measured gradients agree closely with exact values and that training converges, suggesting energy and computational-cost benefits for large-scale systems.
In Situ Training of Photonic Neural Networks via Backpropagation
Implementing machine learning algorithms in integrated photonics is a promising hardware direction, particularly for the computationally intensive operations inside artificial neural networks (ANNs). These operations, dominated by large matrix-vector multiplications, can potentially be performed more efficiently in photonic circuits owing to the inherent parallelism and low latency of optical signals. A central challenge, however, is training photonic neural networks (PNNs) efficiently and directly on the photonic platform itself rather than relying on external computational models. The paper under review introduces a methodology to achieve exactly this through an in situ backpropagation approach.
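For orientation, here is a minimal sketch of the feed-forward model assumed throughout: each layer applies a matrix-vector product, realized optically by a tuned interferometer mesh, followed by an element-wise nonlinearity. The saturable-amplitude activation and all names are illustrative choices, not the paper's exact model.

```python
import numpy as np

# Minimal sketch of a feed-forward PNN: each layer is a matrix-vector
# product (an interferometer mesh W, applied optically) followed by an
# element-wise nonlinearity f. Illustrative only, not the paper's model.

def saturate(z):
    """Saturable amplitude nonlinearity acting on complex field amplitudes."""
    return np.tanh(np.abs(z)) * np.exp(1j * np.angle(z))

def pnn_forward(x, meshes, f=saturate):
    """Propagate complex field amplitudes through successive layers."""
    for W in meshes:
        x = f(W @ x)  # optical matrix-vector product, then activation
    return x

# Example: two random unitary "meshes" acting on 4 optical modes.
rng = np.random.default_rng(0)

def random_unitary(n):
    q, r = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
    return q * (np.diag(r) / np.abs(np.diag(r)))  # phase-fixed, still unitary

y = pnn_forward(np.ones(4, dtype=complex), [random_unitary(4) for _ in range(2)])
```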
Methodological Insights
The authors propose leveraging the adjoint variable method (AVM) to implement a photonic analogue of backpropagation. AVM is a standard tool in photonics for sensitivity analysis and optimization. Building on it, the authors show that the gradients needed to train the network can be computed exactly from intensity measurements taken within the photonic device itself. This removes the dependence on accurate external model simulations and allows training to occur directly on the hardware.
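To make the underlying idea concrete, the sketch below illustrates AVM on a real-valued toy problem: for a steady-state linear system A(φ)e = b, one forward solve plus one adjoint solve yields the gradient of the objective with respect to a parameter φ. The objective, names, and finite-difference check are illustrative, not the paper's formulation, which involves the complex-valued Maxwell operator.

```python
import numpy as np

# Real-valued toy illustration of the adjoint variable method (AVM),
# assuming a steady-state linear system A(phi) e = b and the objective
# L = 0.5 * ||e - e_target||^2. Illustrative, not the paper's formulation.

def avm_gradient(A, dA_dphi, b, e_target):
    e = np.linalg.solve(A, b)                  # forward solve: original field
    lam = np.linalg.solve(A.T, e - e_target)   # adjoint solve: A^T lam = dL/de
    return -lam @ (dA_dphi @ e)                # dL/dphi, no further solves

# Sanity check against central finite differences, with A(phi) = A0 + phi*dA.
rng = np.random.default_rng(1)
n = 5
A0 = rng.normal(size=(n, n)) + n * np.eye(n)   # keep the system well conditioned
dA = rng.normal(size=(n, n))
b, e_target = rng.normal(size=n), rng.normal(size=n)

L = lambda A: 0.5 * np.sum((np.linalg.solve(A, b) - e_target) ** 2)
eps = 1e-6
g_fd = (L(A0 + eps * dA) - L(A0 - eps * dA)) / (2 * eps)
assert np.isclose(avm_gradient(A0, dA, b, e_target), g_fd, rtol=1e-4)
```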
A pivotal component of the proposal is the time-reversal interference method (TRIM), which performs the in situ computation of the gradient of a photonic feed-forward network's cost function. The procedure has three main phases: propagating the input through the circuit to obtain the original field, sending the error signal backward to obtain the adjoint field, and interfering the original field with the time-reversed adjoint field so that the resulting intensity pattern at each tunable element directly encodes its gradient. Because every element is read out simultaneously, all circuit parameters can be updated in parallel.
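The core trick is that the gradient term at each element is an overlap of the original and adjoint fields, Re(e_og · e_aj), and this overlap can be recovered purely from intensities: interfering the original field with the time-reversed adjoint field and subtracting the two individual intensities isolates the cross term. The toy sketch below checks this identity numerically; the random complex vectors stand in for fields a real device would record with integrated photodetectors.

```python
import numpy as np

# Numerical check of the interference identity behind TRIM:
#   |e_og + conj(e_aj)|^2 = |e_og|^2 + |e_aj|^2 + 2 * Re(e_og * e_aj),
# so the gradient term Re(e_og * e_aj) follows from intensity measurements alone.

rng = np.random.default_rng(2)
e_og = rng.normal(size=8) + 1j * rng.normal(size=8)   # original (forward) field
e_aj = rng.normal(size=8) + 1j * rng.normal(size=8)   # adjoint field

I_og = np.abs(e_og) ** 2                    # measurement 1: original intensity
I_aj = np.abs(e_aj) ** 2                    # measurement 2: adjoint intensity
I_int = np.abs(e_og + np.conj(e_aj)) ** 2   # measurement 3: interference with
                                            # the time-reversed adjoint field

grad_term = 0.5 * (I_int - I_og - I_aj)     # isolates Re(e_og * e_aj)
assert np.allclose(grad_term, np.real(e_og * e_aj))
```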
Numerical Validation and Implementation
The authors validate their method numerically by training a simulated PNN to perform specific computational tasks, such as implementing an XOR gate. Using finite-difference frequency-domain (FDFD) simulations, they demonstrate close agreement between gradients computed directly and those produced by the proposed measurement procedure. The trained network converges to the desired input-output behavior, underscoring the feasibility of training PNNs directly on photonic hardware.
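As a flavor of what such training looks like, the following much-simplified stand-in trains the two phase shifters of a single Mach-Zehnder interferometer to route all input power to one output port, with finite-difference gradients playing the role of the TRIM intensity measurements. The paper's actual demonstration (an XOR gate trained through FDFD-simulated fields) uses a larger system, but the loop structure is the same: measure gradients, update phases in parallel, repeat.

```python
import numpy as np

# Simplified stand-in for the paper's training demo: gradient descent on
# the phases of one Mach-Zehnder interferometer so it routes all power to
# output port 1. Finite differences stand in for in situ TRIM measurements.

B = np.array([[1, 1], [1, -1]]) / np.sqrt(2)     # 50/50 beamsplitter

def output_power(phis, x=np.array([1.0, 0.0])):
    z = B @ (np.exp(1j * phis) * (B @ x))        # split, phase-shift, recombine
    return np.abs(z[1]) ** 2                     # detected power at port 1

def loss(phis):
    return (output_power(phis) - 1.0) ** 2       # target: all power in port 1

phis = np.array([0.3, -0.2])                     # avoid the zero-gradient point
eps, lr = 1e-6, 0.5
for step in range(200):
    grad = np.array([
        (loss(phis + eps * np.eye(2)[k]) - loss(phis - eps * np.eye(2)[k]))
        / (2 * eps)
        for k in range(2)
    ])
    phis -= lr * grad                            # parallel phase updates

print(output_power(phis))                        # -> approaches 1.0
```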
Implications and Future Directions
Practically, the method holds significant promise for reducing the energy and computational costs of training large-scale neural networks by exploiting the efficient linear operations of photonic systems. Theoretically, it lays a foundation for further work at the intersection of photonics and machine learning. Future research could scale the approach to larger networks with more complex architectures and benchmark its performance and efficiency across different photonic platforms.
Moreover, the approach could catalyze advances in adaptive photonic systems that self-optimize and self-configure without resorting to brute-force parameter sweeps or detailed external models. The authors do note limiting assumptions, such as a lossless system and the availability of compatible integrated measurement hardware, but advances in photonic detector technology and design optimization may offset these constraints.
In conclusion, the paper lays out a detailed, practical pathway for training photonic neural networks directly on photonic hardware, pointing toward tightly integrated photonic intelligence in machine learning applications.