Roadmap on Neuromorphic Photonics (2501.07917v2)
Abstract: This roadmap consolidates recent advances while exploring emerging applications, reflecting the remarkable diversity of hardware platforms, neuromorphic concepts, and implementation philosophies reported in the field. It emphasizes the critical role of cross-disciplinary collaboration in this rapidly evolving field.
Summary
- The paper consolidates findings from 40+ research teams to outline a comprehensive framework for photonic neural networks in AI.
- It details various implementation schemes—such as frequency and time multiplexing, and diffractive computing—to enhance parallel processing and energy efficiency.
- The roadmap emphasizes cross-disciplinary collaboration and integration to overcome challenges in scalability, error correction, and system-level analog-optical performance.
The roadmap consolidates findings from over 40 research teams to provide a comprehensive framework for advancing photonic neural networks (PNNs) in AI. It reflects the diversity in hardware platforms, neuromorphic concepts, and implementation philosophies in the field. The roadmap emphasizes the need for cross-disciplinary collaboration to advance the technology from academic studies to real-world applications.
The roadmap covers a variety of topics, including:
- embedding dimensions such as frequency, time delay, and space
- architectures of PNNs
- methods for implementing PNN architectures in photonic hardware
- integrated photonic hardware
- realization of photonic weights and memories
- optimization of training processes for photonic neuromorphic architectures
- potential applications
Frequency multiplexed photonic neuromorphic computing leverages the frequency degree of freedom of light for information processing. Wavelength-division multiplexing (WDM) has been used since the late 1980s to multiply the capacity of optical communication channels. A similar scheme has been exploited to implement frequency multiplexed neuromorphic computing, where information is encoded in the comb line amplitudes and mixed by phase modulation, replicating the effect of untrained connections in a neural network. This principle has been employed to realize both reservoir computers (RC) and extreme learning machines (ELM). These schemes are economical in terms of hardware because multiple "neurons" are processed in parallel using the same optical circuit, and multiple networks can be operated in parallel in different frequency bands. Furthermore, using a programmable spectral filter, one can readily apply output weights directly in the optical domain. These advantages have been demonstrated in a frequency multiplexed implementation of Deep Reservoir Computing.
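To make the readout concrete, here is a minimal Python sketch (our illustration, not hardware from the roadmap): inputs are encoded on comb-line amplitudes, mixed by a fixed banded unitary standing in for the phase-modulation-induced coupling between neighboring lines, and only the output weights are trained, which a programmable spectral filter would apply directly in the optical domain.

```python
import numpy as np

rng = np.random.default_rng(0)
n_lines, n_samples = 32, 500

# Inputs encoded on the amplitudes of the comb lines.
X = rng.uniform(0, 1, (n_samples, n_lines))

# Phase modulation couples neighboring comb lines; a fixed banded matrix,
# orthonormalized to conserve energy, stands in for this untrained mixing.
mix = np.eye(n_lines, dtype=complex)
for k in range(1, 4):
    mix += 0.3**k * (np.eye(n_lines, k=k) + np.eye(n_lines, k=-k))
U, _ = np.linalg.qr(mix)

# "Neurons": square-law photodetection of the mixed comb lines.
states = np.abs(X @ U) ** 2

# Train only the output weights with ridge regression; in hardware, a
# programmable spectral filter applies them directly in the optical domain.
y = np.sin(3 * X[:, 0]) + 0.5 * X[:, 1]   # toy target
w = np.linalg.solve(states.T @ states + 1e-3 * np.eye(n_lines), states.T @ y)
print("train MSE:", np.mean((states @ w - y) ** 2))
```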
The field of frequency multiplexed neuromorphic information processing is in its early stage of development and the main challenges include:
- increasing the complexity of solvable tasks by using better hardware or new algorithms
- improving scalability by reducing energy consumption and footprint while increasing speed of operation
Electro-optical phase modulation requires strong RF signals to achieve good mixing (up to almost 1 W), while exploiting optical nonlinearities such as the Kerr effect typically requires high optical powers. It has been shown that useful information mixing already occurs with a Kerr-induced nonlinear phase of 0.3 rad (or lower), suggesting that it may be possible to avoid the use of high optical power. Another challenge is the need for high-resolution, high-speed programmable spectral filters for manipulating light in the wavelength domain. The most common commercial technology for programmable spectral filtering is based on liquid crystals, which operate in the C and L bands and offer a minimum bandwidth of 10 GHz and a settling time of 500 ms. Integrated optics can provide high-performance frequency combs, and programmable spectral filters with a channel density higher than the DWDM standard can be realized in integrated photonics with sub-ms settling times.
Time-multiplexed reservoir computing in the optical domain uses various physical systems as reservoirs. The delayed feedback leads to a high-dimensional phase space of the dynamical system and thus to a very complex transient response in time. This response can be sampled in time, allowing single-node reservoirs with a high number of virtual network nodes. It is not the spatial scale that limits the network dimension but only the details of output sampling and the system timescale. Efficient hyper-parameter optimization in physical RC is an issue under ongoing research. While such optimizations can easily be done in numerical simulations, tuning all the necessary parameters is not always possible in an optical setup. Using pre-processed data has proven very helpful in mitigating the hyper-parameter dependence. Specifically, in delay-based RCs the hyper-parameters delay time $\tau$ and input clock cycle $T$ (injection rate $1/T$) determine the internal coupling topology. Historically, only two settings were chosen: either resonant ($\tau = T$) or desynchronized ($\tau = T + \theta$, with $\theta$ the virtual-node separation). As reported in [12], a resonant delay degrades performance, which was explained in [13, 14] by the dimensionality reduction of the accessible phase space.
For time-multiplexed reservoir computing in the optical domain to be useful, a key challenge lies in developing methods that yield an advantage in terms of high data-processing speed and low energy consumption. A vital research question is the matching of timescales between task and reservoir: due to the sequential nature of data injection, a higher-dimensional reservoir implies slower injection rates, which can be circumvented by parallelizing multiple reservoirs. Another challenge of physical RC implementations lies in the training process, because the linear regression step cannot be realized fully optically in optical RC schemes. Transferring the time-multiplexing approach to neuromorphic computing schemes beyond RC is also an active research direction to extend the readout dimension in a large class of analog computing systems.
For realizing analog computing edge devices with a small energy footprint and sustainable materials, new biologically inspired substrates need to be explored, as well as encoding schemes that allow for higher information density, for example by utilizing both the phase and amplitude of the light. In time-multiplexed systems, the input needs to be masked to guarantee a diverse response to the input, but this usually requires fast pattern generation, which limits the applicability of energy-efficient hardware implementations.
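The interplay of mask, virtual nodes, and delay topology can be illustrated with a small simulation; the sin() nonlinearity and ring-shaped feedback are toy assumptions, not a model of any specific optical setup.

```python
import numpy as np

rng = np.random.default_rng(1)
N = 50                             # virtual nodes per delay span
mask = rng.choice([-0.5, 0.5], N)  # fixed random input mask

def run_reservoir(u, offset=1, eta=0.8):
    """Delay-based reservoir with delay tau = T + offset*theta, in units
    of the virtual-node separation theta. offset=0 is the resonant case
    (tau = T): every virtual node feeds back only onto itself, reducing
    the accessible phase-space dimension. offset=1 (desynchronized)
    couples the nodes in a ring. sin() is a toy device nonlinearity."""
    states = np.zeros((len(u), N))
    prev = np.zeros(N)
    for t, x in enumerate(u):
        cur = np.sin(mask * x + eta * np.roll(prev, offset))
        states[t] = prev = cur
    return states

u = rng.uniform(0, 1, 1000)
S_desync = run_reservoir(u, offset=1)   # rich mixing across virtual nodes
S_res = run_reservoir(u, offset=0)      # resonant: nodes stay uncoupled
```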
Optical neural networks (ONNs) based on wavelength-division multiplexing (WDM) enable parallel input nodes and weighted synapses, with the potential for clock rates reaching tens of gigahertz. Optical computing operations based on WDM utilize multi-wavelength sources combined with weight banks or wavelength-sensitive elements, including micro-ring resonators (MRRs), semiconductor optical amplifiers (SOAs), and phase-change memory (PCM).
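The core operation of such a WDM scheme can be sketched numerically as follows: each wavelength channel carries one input, a bank of wavelength-selective elements applies one weight per channel, and a photodetector sums all channels into a single multiply-and-accumulate result. The Lorentzian micro-ring transmission and the restriction to positive weights are simplifying assumptions for illustration.

```python
import numpy as np

def mrr_through_transmission(detune, fwhm=1.0):
    """Lorentzian stand-in for a micro-ring through-port transmission:
    on resonance (detune=0) the channel is dropped (weight ~ 0), far
    off resonance it passes (weight ~ 1)."""
    return detune**2 / (detune**2 + (fwhm / 2) ** 2)

def wdm_mac(x, target_weights, fwhm=1.0):
    """Map desired weights in [0, 1) to ring detunings by inverting the
    Lorentzian, then 'detect' the summed power of all channels."""
    w = np.clip(target_weights, 0.0, 0.999)
    detune = (fwhm / 2) * np.sqrt(w / (1 - w))
    applied = mrr_through_transmission(detune, fwhm)
    return np.sum(applied * x)      # photodetector sums all wavelengths

x = np.array([0.2, 0.9, 0.5, 0.7])  # inputs on four wavelength channels
w = np.array([0.1, 0.8, 0.3, 0.6])  # desired synaptic weights
print(wdm_mac(x, w), "vs ideal dot product", np.dot(w, x))
```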
Dense integration of the entire photonic system is crucial to achieve computing densities competitive with electrical counterparts. Hybrid integration techniques are necessary to effectively exploit optics' broad bandwidth with WDM techniques, integrating sources and multiply-and-accumulate units. Expanding the range of on-chip computing operations, including nonlinear neuron functions and Fourier transforms, is essential to enhance ONN universality for diverse machine learning tasks. This requires advances in computing architectures tailored to specific operations and the integration of high-nonlinearity components enabling nonlinear functions at low optical power. Moreover, as ONNs comprise massive numbers of programmable photonic units for high spatial-division parallelism, tailored algorithms are needed to address fabrication imperfections and on-chip cross-talk, ensuring fast-converging control of on-chip elements and network training.
Integrated optical frequency combs (OFCs) are crucial for creating the multi-wavelength source due to their compact design, unlike discrete laser arrays. OFCs ensure equal frequency intervals between wavelength channels, facilitating easy manipulation in the frequency domain. Integrated OFCs fall into several categories based on their physical origins: a) Kerr frequency combs, or microcombs, originate from parametric oscillation in integrated MRRs. b) Mode-locked lasers utilize gain media like Erbium-doped fiber amplifiers and mode-locking mechanisms such as saturable absorbers to produce pulsed outputs. c) Electro-optic combs use the second-order nonlinearity of modulators to introduce sidebands around an optical carrier, requiring external RF sources. A critical requirement for microcomb generation is sufficient parametric gain, which depends directly on the third-order nonlinearity of the material substrate and on the resonator Q factor, i.e., on low linear and nonlinear losses.
Diffractive photonic computing units (DPUs) manipulate large numbers of photons effectively during complex optical field propagation by using diffractive elements. Weighted interconnections are established through interference and superposition of the modulated optical field between successive layers. The layered diffractive coefficients can be optimized with large-scale deep learning, allowing efficient modulation of the optical field propagation to achieve the desired high-dimensional system mapping. Early 3D-printed diffractive photonic neural networks contained millions of neurons and offered very low latency and power consumption; however, the weights were fixed once printed, which greatly limited their application areas. To overcome this limitation, researchers used spatial light modulators (SLMs) to create a large-scale neuromorphic photonic computing architecture employing a reconfigurable DPU with spatio-temporal multiplexing. To improve device integration, researchers have started using diffractive optical elements (DOEs) and subwavelength meta-structures for the 3D integration of DPUs.
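A minimal numerical sketch of such a unit is shown below: each layer imprints a trainable phase profile, and the field then propagates to the next layer, modeled here with the standard angular spectrum method. The geometry (pixel pitch, wavelength, layer spacing) and the random phase masks are illustrative assumptions.

```python
import numpy as np

def angular_spectrum(field, dx, wavelength, z):
    """Free-space propagation over distance z (angular spectrum method)."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)
    FX, FY = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength**2 - FX**2 - FY**2
    kz = 2 * np.pi * np.sqrt(np.maximum(arg, 0.0))  # evanescent waves dropped
    return np.fft.ifft2(np.fft.fft2(field) * np.exp(1j * kz * z))

def dpu_forward(field, phase_masks, dx=1e-6, wavelength=532e-9, z=5e-3):
    """One pass through stacked diffractive layers: each layer applies a
    trainable phase profile, then the field diffracts to the next layer;
    a detector finally reads out the intensity pattern."""
    for phi in phase_masks:
        field = angular_spectrum(field * np.exp(1j * phi), dx, wavelength, z)
    return np.abs(field) ** 2

rng = np.random.default_rng(7)
n = 128
masks = [rng.uniform(0, 2 * np.pi, (n, n)) for _ in range(3)]  # 3 layers
out = dpu_forward(np.ones((n, n), dtype=complex), masks)
print(out.shape, out.sum())
```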
The ideal photonic neural network architecture should process information with sufficient computational accuracy, reconfigurability, high integration density, low power consumption, and high energy efficiency. To construct large-scale advanced neural network architectures using DPUs, the critical task is to create a matrix multiplier with sufficient computational accuracy, but diffractive photonic computing is an analog process that suffers from accuracy issues and layer-by-layer accumulation of errors. Additionally, for the architecture to be applicable to various scenarios, the device must be reconfigurable for adaptive tuning and training, yet light still experiences significant energy loss when propagating through a DPU. Generative AI often deals with tensor operations in thousands of dimensions, so photonic computing must be scalable and capable of handling large data throughputs. Finally, the lack of efficient optical nonlinearities restricts these networks to linear operations, which limits both their accuracy and their ability to handle complex tasks.
To address these challenges, researchers should prioritize synergy between hardware and software, coordinating the design of bottom-layer devices and top-layer architectures to improve computational accuracy and integration. Passive modules with low energy consumption and low latency are suitable for fixed tasks, whereas active modules can be programmed to switch between tasks. Additionally, there have been attempts at optoelectronic or all-optical nonlinearities, such as utilizing the nonlinearity of ferroelectric films or the saturable absorption of graphene. At a higher level, a variety of photonic computing systems and applications can be achieved through the large-scale deployment of advanced neural network architectures. To increase the network scale, joint training and error correction of the system must be addressed urgently from the software side, for example by in-situ training of photonic chips, where updated gradient values are computed directly by back-propagating light on the same hardware.
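In-situ optical back-propagation is hardware-specific, but a generic hardware-in-the-loop correction can be sketched: gradients are estimated from extra forward measurements on the device itself via central finite differences. Note this is a black-box substitute used here for illustration, not the in-situ light back-propagation scheme itself; the surrogate device model is an assumption.

```python
import numpy as np

rng = np.random.default_rng(6)

def dpu_measure(phases, x):
    """Stand-in for one measurement on the physical DPU; a toy surrogate
    (linear optics plus square-law detection) keeps the sketch runnable."""
    return np.abs((x * np.exp(1j * phases)).sum()) ** 2

def loss(phases, x, target):
    return (dpu_measure(phases, x) - target) ** 2

# Hardware-in-the-loop training: each gradient entry is estimated from
# two extra forward measurements (central finite differences).
phases = rng.uniform(0, 2 * np.pi, 16)
x = rng.random(16)
target, eps, lr = 4.0, 1e-3, 0.01
for _ in range(200):
    grad = np.zeros_like(phases)
    for i in range(len(phases)):
        p_plus, p_minus = phases.copy(), phases.copy()
        p_plus[i] += eps
        p_minus[i] -= eps
        grad[i] = (loss(p_plus, x, target) - loss(p_minus, x, target)) / (2 * eps)
    phases -= lr * grad
print("final loss:", loss(phases, x, target))
```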
Analog optical computers (AOC) can potentially offer more than 100x improvement in overall system efficiency in terms of Tera Operations per Second per Watt (TOPS/Watt) at Int8 precision, as compared to state-of-the-art digital hardware, at scale. AOC uses optical and analog electronic technologies, respectively, to accelerate linear and non-linear compute primitives. AOC is not a general-purpose computer, but the exact same hardware can accelerate two computationally intensive verticals: machine learning (ML) inference and hard combinatorial optimization problems. A critical innovation area in AOC is the codesign of hardware with these application verticals.
At the hardware level, planar optical technologies offer the key advantage of component miniaturization, which is critical for scaling, but they suffer from the fundamental challenge that precious on-chip real estate is used both for computing on and routing of data. Three-dimensional (3D) optics using surface-emitting sources, modulators, and detectors sidesteps this challenge, opening the path to a step change in performance over digital chips. AOC tackles these hardware-level challenges by combining integrated 3D optics to accelerate linear operations with analog electronics for non-linear operations. This combination means the entire computation for ML and optimization problems can be done in the analog domain, without any on-path digital conversions and without an explicit clock. Ultra-high TOPS/Watt is necessary but not sufficient to accelerate the target applications. In ML, prevalent models like auto-regressive transformers are IO-bound. In optimization, there has been much work on Ising machines, but their ability to efficiently accommodate real-world problems remains an unresolved challenge. On the ML front, emerging models, e.g., energy-based and diffusion models, achieve excellent performance and functionality at the cost of increased operational intensity, which is well matched to the strengths of AOC. On the optimization front, a more expressive abstraction for hard optimization problems is quadratic unconstrained mixed optimization (QUMO), which maps naturally to the hardware and captures real-world optimization problems.
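As an illustration of the QUMO abstraction, the sketch below evaluates a quadratic objective over mixed binary and continuous variables and runs a projected-gradient loop as a digital stand-in for the analog solver's dynamics. The domain conventions ({-1,+1} binaries, [-1,1] continuous) and the toy problem are our assumptions, not the AOC formulation itself.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 8
Q = rng.normal(size=(n, n))
Q = (Q + Q.T) / 2                 # symmetric quadratic couplings
c = rng.normal(size=n)
is_binary = np.arange(n) < 4      # first 4 variables binary, rest continuous

def qumo_objective(x):
    """Quadratic unconstrained mixed objective: x'Qx + c'x."""
    return x @ Q @ x + c @ x

def project(x):
    """Project onto the mixed domain: binaries to {-1,+1},
    continuous variables clipped to [-1, 1]."""
    x = np.clip(x, -1.0, 1.0)
    x[is_binary] = np.sign(x[is_binary]) + (x[is_binary] == 0)
    return x

# Toy projected-gradient loop, a digital stand-in for the analog
# solver's continuous dynamics plus domain projection.
x = project(rng.uniform(-1, 1, n))
for _ in range(200):
    x = project(x - 0.02 * (2 * Q @ x + c))   # gradient of x'Qx + c'x
print("objective:", qumo_objective(x))
```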
The technologies underlying AOC hardware have been intentionally chosen to have an existing manufacturing ecosystem, but the micro-LED-based light sources used in AOC need to improve their efficiency for the computer to achieve the target TOPS/Watt. Moreover, while optical fan-in and fan-out architectures have scaling potential, they still lack demonstrations at significant scale and need further technological enhancements. Also, the existing silos between algorithmic and ML experts and non-traditional hardware designers need to be broken.
Spatial photonic Ising machines (SPIMs) are a paradigm encompassing Ising machines (IMs) that operate by spatial light modulation. Spins are represented by binary phases multiplexed in space, and the intensity on the detector gives the absolute value of the Ising Hamiltonian. Computing the Ising energy is the building block of most heuristic algorithms for searching the ground state and for simulating spin systems at finite temperature. For N spins this computation requires O(N²) MAC operations on a digital processor, whereas on a SPIM the result is obtained by a single intensity measurement with a computational cost of O(1), almost independent of the system size, and low power consumption (mW laser light). The device operates as a photonic annealer: the Ising energy is measured, and its value is used to iteratively update the spin configuration via digital feedback using a Metropolis-Hastings algorithm. The success probability has been enhanced by using physical noise within the experimental setup instead of digitally generated random numbers.
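The annealer loop is easy to sketch for the Mattis-type couplings J_ij = ξ_i ξ_j mentioned in the next paragraph, where the detected intensity ∝ |Σ_i ξ_i σ_i|² yields the (sign-flipped) Ising energy in one shot; digital feedback then applies the Metropolis-Hastings rule. The choice of Mattis couplings and the annealing schedule are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 64
xi = rng.choice([-1.0, 1.0], N)   # Mattis pattern: couplings J_ij = xi_i * xi_j

def spim_energy(sigma):
    """Stand-in for the single-shot measurement: detected intensity is
    proportional to |sum_i xi_i sigma_i|^2, giving the Mattis Ising
    energy H = -(xi . sigma)^2 without O(N^2) digital MACs."""
    return -np.dot(xi, sigma) ** 2

# Photonic-annealer loop: measure the energy, flip one spin, accept or
# reject with the Metropolis rule via digital feedback.
sigma = rng.choice([-1.0, 1.0], N)
E = spim_energy(sigma)
beta = 0.01                        # inverse temperature (toy schedule)
for step in range(5000):
    i = rng.integers(N)
    sigma[i] *= -1                 # propose a single spin flip
    E_new = spim_energy(sigma)
    if E_new <= E or rng.random() < np.exp(-beta * (E_new - E)):
        E = E_new                  # accept
    else:
        sigma[i] *= -1             # reject: undo the flip
    beta *= 1.001
print("final energy:", E, "| ground state:", -N**2)
```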
In SPIMs, couplings are also realized optically and controlled by spatial light modulation with 8-bit resolution. The Gauge transformation method allows Mattis-type couplings by using a single phase-only SLM, greatly simplifying the setup and enabling accurate photonic simulation of various magnetic phase transitions. Very recently, intense efforts have led to various fully programmable SPIMs based on matrix decomposition and multiplexing schemes.
The challenge is to reduce the SPIM iteration time by orders of magnitude by designing dedicated algorithms, such as genetic algorithms, that further exploit the SPIM's spatial parallelism and minimize the number of iterations needed to reach the ground state. Research aimed at overcoming the bottlenecks associated with digital feedback, either through analog electronics or optical cavities, is also crucial to push SPIMs into the field of ultrafast computation. The SPIM concept can be extended to models with non-binary spins and higher-order Hamiltonians, such as SPIMs that realize four-body interactions by second-harmonic generation in nonlinear media. The multi-level phase modulation of SLMs allows the encoding of clock and circular spins, enabling simulations of the XY Hamiltonian with programmable couplings and its topological dynamics.
A first advance towards an ultrafast SPIM is offered by upgrades in DMDs and MEMS-based SLMs, and the breakthrough will be the development of electro-optic SLMs, which promise GHz frame rates. The field is rich in interesting developments, and further advantages and new possibilities are envisioned by using few-photon sources and quantum light to drive the setup.
The Extreme Learning Machine (ELM) is a computational paradigm that offers an efficient alternative to traditional neural networks and Support Vector Machine (SVM) models. An ELM is a feed-forward neural network consisting of a single hidden layer where information is processed and sent to an output layer formed by at least one output node. The hidden layer nonlinearly maps input signals into a higher-dimensional computational space using random weights and an infinitely differentiable nonlinear function. Training occurs exclusively in the output layer through a standard linear method, such as linear regression. Since ELM does not require internal-interconnection tuning, it is well suited to photonic implementations, hence the name photonic ELM (PELM) or optical ELM (OELM).
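The training procedure reduces to a single linear solve, as in this minimal sketch. Here tanh is one common choice of infinitely differentiable nonlinearity; a PELM would realize the random mapping physically and often relies on the detector's square law instead.

```python
import numpy as np

rng = np.random.default_rng(4)

def elm_fit(X, y, n_hidden=200, reg=1e-3):
    """ELM: fixed random hidden layer, training only the linear readout
    (ridge regression). In a photonic ELM the random expansion happens
    in the optical substrate and only the output weights are learned."""
    W = rng.normal(size=(X.shape[1], n_hidden))   # random, never trained
    b = rng.normal(size=n_hidden)
    H = np.tanh(X @ W + b)                        # hidden-layer expansion
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

X = rng.uniform(-1, 1, (500, 2))
y = np.sin(3 * X[:, 0]) * X[:, 1]                 # toy regression target
W, b, beta = elm_fit(X, y)
print("train MSE:", np.mean((elm_predict(X, W, b, beta) - y) ** 2))
```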
PELMs have been demonstrated on different platforms and with different input-data encoding methods. For example, with free-space optics, the encoding is performed on the input optical wavefront by a phase-only spatial light modulator (SLM), and the phase information is linearly self-mixed during light propagation. In photonic integrated circuits (PICs), a PELM has been demonstrated using an array of microresonators for random input-space expansion and integrated microheaters to encode the input. Most implementations use the square law of photodetection as the nonlinear function, while a few leverage the intrinsic nonlinearities of the materials, such as the Kerr nonlinearity of optical fiber or of atomic vapor.
Two fundamental aspects of ELM require further study: the random expansion of the input space into higher dimensions and the application of the nonlinear function. In bulk systems, mapping input signals into a higher-dimensional space is relatively straightforward, but expansion strategies for fibers and PICs typically work spatially (by increasing the physical size), virtually (by using virtual nodes), in frequency (through wavelength multiplexing), or in the optical field distribution (by using different mode orders). These methods can be combined to create complex hyperspaces. In PICs, spatial size expansion is constrained by propagation losses, so a hybrid approach, like space-wavelength multiplexing, seems optimal. Moreover, existing PIC implementations only use detector nonlinearities; there is a lack of studies based on the inherent nonlinearity of the waveguides and on the number of nodes needed to ensure optimal performance. The role of the inherent nonlinearity of the material platform is another topic of research, where some studies indicate limited learning capability of PELMs when light propagates in linear or weakly nonlinear systems. Remarkably, the usual square-law nonlinearity of photodetectors is always present in the demonstrated PELMs, due to the offline data-processing phase. However, measuring the optical field intensity loses the phase information, which in turn compresses the input-space expansion. Finally, a big open challenge is the all-optical approach, because optical readout would accelerate data processing and reduce the amount of data to be stored, especially with large nonlinear input mappings. Using conventional Mach-Zehnder modulators controlled by microheaters for readout and encoding hampers performance due to long thermalization times and thermal cross-talk.
Liquid crystal light modulators are versatile tools for generating arbitrary optical fields, enabling precise control over phase and amplitude. For PICs, improvements in fabrication methods, mostly hybrid approaches, could facilitate the realization of all-optical PELMs. This requires efficient and compact p-n junction modulators, which could mitigate the issues associated with thermo-optic actuation. Electronic pre-training involves initial photodiode detection of both amplitude and phase, followed by software linearization. Furthermore, hybrid approaches based on phase-change materials for non-volatile weights or two-dimensional materials for large optical nonlinearities are unexplored research directions for PELM. However, the scalability of any PELM requires concurrent consideration of the system topology, an area that has not been extensively explored theoretically.
For the optical implementation of neural networks, optics can be used as an interconnect technology, for instance to transfer data from memory to processors. Alternatively, weighted optical interconnections can realize the linear part of the system. For optics to become an integral part of GPU and TPU systems, it must offer a strong advantage, since digital technology is rapidly advancing and the software infrastructure has been developed for digital-only systems. Light modulators as well as optical and electronic components can be integrated in such systems. The potential advantage of including optics depends on the performance of the optical-electronic-optical (OEO) conversion: if the OEO system is not sufficiently power-efficient, accurate, and fast, then whatever advantage optics brings can be washed away. The devices used in most laboratory demonstrations differ considerably: the 2D integrated-optics approach uses devices developed for fiber-optic telecommunications, often including high-speed modulators and detectors (>1 GHz), while 3D free-space implementations use spatial light modulators (SLMs) and cameras developed for imaging and display applications. SLMs and cameras can support several million pixels (neurons) in an area of approximately 1 cm², but the bandwidth of each channel (pixel) is limited to the kHz range. Combining high-speed integrated modulators with the parallelism of 2D SLMs shows promise that such optical systems can outperform all-digital systems in the coming years. The power consumed by the interface between the digital computer and the optical system should be minimized; another approach is to minimize the role of the digital system and the accompanying OEO conversions in the hybrid system.
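The OEO argument can be made quantitative with a back-of-envelope model; all parameter values below are illustrative assumptions, not measured figures from the roadmap.

```python
def energy_per_mac_fj(e_oeo_pj, macs_per_conversion, e_optical_mac_fj=1.0):
    """Effective energy per MAC: the optical per-MAC cost plus the OEO
    conversion cost amortized over the MACs done between conversions."""
    return e_optical_mac_fj + 1000.0 * e_oeo_pj / macs_per_conversion

# With an assumed 1 pJ per OEO conversion and 1 fJ per optical MAC, the
# conversion overhead dominates unless many MACs share each conversion.
for macs in (10, 100, 10_000):
    print(f"{macs:>6} MACs per OEO conversion -> "
          f"{energy_per_mac_fj(1.0, macs):.1f} fJ per MAC")
```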
Implementing Artificial Neural Networks (ANNs) using photonic technology shows great promise in terms of scalability, speed, energy efficiency, and parallel information processing. A key challenge for NN hardware integration is achieving parallel interconnections scalably; current implementations are limited to networks of around 1000 neurons. For routing networks in 2D, the input and output neurons are arranged in columns and rows and are connected by wires (electronics) or waveguides (photonics) within the space allocated in between, forming so-called cross-bar arrays. The area of such circuitry scales as the product of the input and output neuron numbers, hence the footprint scales quadratically with network size. Expanding the implementation into a third dimension fundamentally eases this scalability conflict because, for example, the input and output neurons then occupy the 3D circuit's top and bottom 2D planes, respectively, while the circuit's volume can be used for out-of-plane interconnections. Implementing dense network connections in the 2D in-memory computing setting of an electronic cross-bar array likewise leads to a quadratic relationship between energy consumption and the number of neurons.
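A quick numerical comparison makes the footprint argument concrete (units are arbitrary; the point is the quadratic-versus-linear growth):

```python
# A 2D cross-bar needs an area proportional to N_in * N_out (quadratic
# for N_in = N_out = N), while a 3D layout places inputs and outputs on
# two N-element planes whose area grows only linearly with N, using the
# volume for the N^2 out-of-plane interconnections.
for N in (100, 1_000, 10_000):
    print(f"N={N:>6}: 2D crossbar area ~ {N * N:.1e}, 3D plane area ~ {N:.1e}")
```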
Currently, optimization using error back-propagation relies on a digital-twin approach, which uses the efficient unconventional computing substrate only in the forward direction, for inference. In the backward direction, the derivatives of the neuron activation functions must be propagated, and such symmetry breaking between f(x) on the forward pass and ∂f(x)/∂x on the backward pass is physically forbidden in most substrates. Additive manufacturing via 3D printing stands out as an innovative tool for creating intricate 3D photonic components that are compatible with CMOS.
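A compact sketch of the digital-twin approach: the forward pass runs on the (here simulated) hardware, while gradients come from the twin's idealized derivative, since the substrate cannot physically realize both f(x) and its derivative. The toy device model and its mismatch term are assumptions for illustration.

```python
import numpy as np

def hardware_forward(w, x):
    """Physical substrate (toy surrogate): imperfectly known nonlinearity."""
    return np.tanh(w * x) + 0.05 * np.sin(5 * w * x)

def twin_forward(w, x):
    return np.tanh(w * x)                  # digital twin: idealized model

def twin_grad(w, x):
    """Derivative of the twin, not of the device: this mismatch is the
    price of the digital-twin approach."""
    return x * (1 - np.tanh(w * x) ** 2)

# Inference (forward) runs on the hardware; the backward pass uses the
# twin's derivative. The loop converges when hardware output == target,
# provided the twin gradient keeps the correct sign.
w, x, target, lr = 0.3, 1.0, 0.6, 0.5
for _ in range(100):
    y = hardware_forward(w, x)             # forward on "hardware"
    w -= lr * 2 * (y - target) * twin_grad(w, x)   # backward via twin
print("hardware output:", hardware_forward(w, x), "target:", target)
```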
Using spatially multiplexed modes of an injection-locked large-area vertical-cavity surface-emitting laser (LA-VCSEL), a photonic neural network (PNN) can be built in which all components are realized in hardware using off-the-shelf, commercially available, low-energy-consumption components. The system reached >98% accuracy on 6-bit header recognition tasks and showed promising initial results on the MNIST handwritten-digit recognition dataset, performing classification at an inference bandwidth of 15 kHz. Finally, the concept also demonstrated model-free optimization of photonic NNs, with potential benefits in terms of energy efficiency and reduced scaling of the optimization energy overhead.
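Model-free optimization of such a system can be as simple as a greedy evolutionary loop over Boolean weights, sketched below with a toy surrogate standing in for the physical forward pass; the mutation size, the surrogate, and the toy task are illustrative assumptions, not the published experiment.

```python
import numpy as np

rng = np.random.default_rng(5)

def hardware_forward(w_bool, X):
    """Placeholder for the physical forward pass (e.g. detecting the
    weighted laser modes); a toy surrogate keeps the sketch runnable."""
    return (X @ w_bool.astype(float)) > X.shape[1] / 4

# Model-free (1+1) evolutionary optimization of Boolean weights: flip a
# few weights, keep the change only if the measured error drops. No
# gradients and no model of the substrate are required.
X = rng.random((200, 64))
y = (X @ (np.arange(64) < 32).astype(float)) > 16   # toy classification target
w = rng.random(64) < 0.5
err = np.mean(hardware_forward(w, X) != y)
for _ in range(2000):
    trial = w.copy()
    flip = rng.integers(0, 64, size=2)    # mutate a small random subset
    trial[flip] = ~trial[flip]
    trial_err = np.mean(hardware_forward(trial, X) != y)
    if trial_err <= err:                  # greedy acceptance
        w, err = trial, trial_err
print("final error:", err)
```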