- The paper presents a reprogrammable electro-optic nonlinear activation that converts a small fraction of the optical input into an electrical signal, which then modulates the transmission of the remaining light.
- The study demonstrates, through numerical simulations, that integrating this activation function significantly enhances ONN performance, improving MNIST classification accuracy from 85% to 94%.
- The proposed architecture offers advantages in low power consumption, scalability, and speed, paving the way for future advances in optical computing.
Reprogrammable Electro-Optic Nonlinear Activation Functions for Optical Neural Networks
The paper presents a novel electro-optic hardware platform for implementing nonlinear activation functions in optical neural networks (ONNs). The platform achieves an optical-to-optical nonlinearity by converting a small fraction of the optical input into an electrical signal, which then modulates the intensity of the remaining light. This approach leaves processing speed uncompromised and removes the need for additional optical sources between network layers.
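To make the mechanism concrete, the following sketch models a generic activation of this kind: a fraction of the input power is photodetected and, together with a static bias, drives an intensity modulator acting on the rest of the light. The cosine modulator response and the parameter names alpha, gain, and phase_bias are modeling assumptions for this sketch rather than the paper's exact notation, and the numeric defaults are arbitrary.

```python
import numpy as np

def electro_optic_activation(z, alpha=0.1, gain=np.pi, phase_bias=np.pi):
    """Sketch of an electro-optic activation acting on an optical field amplitude z.

    A fraction `alpha` of the input power is tapped off and photodetected; the
    photocurrent (proportional to alpha * |z|^2) is amplified and, together with
    a static bias, drives an intensity modulator acting on the remaining
    sqrt(1 - alpha) portion of the light. The modulator is modeled with a
    cos(phase/2) field response; any residual phase imprinted on the light is
    ignored for simplicity.
    """
    detected_power = alpha * np.abs(z) ** 2            # tapped and photodetected power
    drive_phase = gain * detected_power + phase_bias   # amplified drive plus static bias
    return np.sqrt(1.0 - alpha) * np.cos(drive_phase / 2) * z

# Sweeping the input amplitude traces out the nonlinear transfer curve: with a
# bias near pi the response suppresses weak inputs and transmits strong ones
# (ReLU-like); changing `phase_bias` reshapes the curve.
z_in = np.linspace(0.0, 3.0, 7)
print(np.abs(electro_optic_activation(z_in)) ** 2)     # output vs. input intensity
```

Because the response is set by the electrical bias rather than by a fixed material nonlinearity, re-biasing the same hardware yields a family of different activation shapes.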
Key Contributions
- Nonlinear Activation Architecture: The paper introduces an activation function that achieves complete nonlinear on-off contrast in transmission at relatively low power thresholds. The electro-optic mechanism detects a small fraction of the optical input, converts it into an electrical signal, and uses that signal to modulate the remaining optical signal. Because the response depends on an applied electrical bias, the activation can be reconfigured to realize a wide range of nonlinear responses.
- Implementation and Results: Numerical simulations demonstrate that the proposed activation function significantly enhances the expressiveness of ONNs. The paper evaluates it on two benchmark tasks: learning a multi-input XOR logic function and classifying handwritten digits from the MNIST dataset (a toy training sketch follows this list). Notably, including the nonlinear activation improved average MNIST accuracy from 85% to 94%.
- Performance Evaluation: The paper analyzes the proposed architecture's power consumption, latency, processing speed, and physical footprint, detailing how each scales with network dimension and layer depth (a back-of-envelope scaling sketch appears below). Power consumption can be reduced further by using efficient, integrated optical components.
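As a complement to the reported benchmarks, the sketch below shows how such an activation can be dropped into a conventional autograd training loop for the XOR task. It is a simplified stand-in for the paper's simulation setup: ordinary real-valued dense layers replace the unitary MZI meshes of an actual ONN, the parameter values are chosen only so the nonlinearity is pronounced at unit amplitudes, and nothing here reproduces the paper's numbers.

```python
import torch

torch.manual_seed(0)

def eo_activation(z, alpha=0.3, gain=2 * torch.pi, phase_bias=torch.pi):
    # Tap a fraction `alpha` of the power, photodetect and amplify it, and use
    # the resulting phase to drive a modulator acting on the remaining light.
    detected_power = alpha * z ** 2
    drive_phase = gain * detected_power + phase_bias
    return (1.0 - alpha) ** 0.5 * torch.cos(drive_phase / 2) * z

# Two-input XOR, with inputs encoded as optical field amplitudes (0 or 1).
x = torch.tensor([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = torch.tensor([[0.], [1.], [1.], [0.]])

# Real-valued dense layers stand in for the unitary MZI meshes of a real ONN.
w1 = torch.nn.Parameter(0.5 * torch.randn(2, 4))
w2 = torch.nn.Parameter(0.5 * torch.randn(4, 1))
opt = torch.optim.Adam([w1, w2], lr=0.05)

for step in range(2000):
    hidden = eo_activation(x @ w1)   # linear layer followed by the activation
    out = hidden @ w2                # linear readout (output detection omitted)
    loss = torch.nn.functional.mse_loss(out, y)
    opt.zero_grad()
    loss.backward()
    opt.step()

with torch.no_grad():
    print("final loss:", loss.item())
    print("outputs:", (eo_activation(x @ w1) @ w2).squeeze())
```

With these placeholder settings the loop typically converges to the 0/1 XOR pattern, but the point is only that the activation is smooth and differentiable, so it can be trained with the same gradient-based tools used for electronic networks; the paper's actual results come from its own ONN simulations.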
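To make the scaling discussion concrete, here is a rough back-of-envelope sketch of how latency and activation energy might scale with network width N and depth L. The functional form assumes each N x N linear layer is realized as an MZI mesh with optical depth of roughly N columns and that each activation adds a fixed electro-optic conversion delay; these are common assumptions for MZI-mesh ONNs, and every numeric device value below is a placeholder rather than a figure from the paper.

```python
def onn_latency_ps(depth_layers, n_modes,
                   mzi_column_delay_ps=1.0,       # placeholder: optical delay per MZI column
                   eo_activation_delay_ps=50.0):  # placeholder: photodetector + modulator delay
    """Rough per-inference latency: an N x N unitary realized as an MZI mesh has
    optical depth of about N columns, and each layer adds one activation delay."""
    per_layer = n_modes * mzi_column_delay_ps + eo_activation_delay_ps
    return depth_layers * per_layer

def activation_energy_pj(depth_layers, n_modes, energy_per_activation_pj=1.0):
    """Electrical energy spent in the activations alone: one per mode per layer."""
    return depth_layers * n_modes * energy_per_activation_pj

# Example: a 2-layer, 100-mode network under these placeholder numbers.
print(onn_latency_ps(2, 100), "ps;", activation_energy_pj(2, 100), "pJ")
```

The sketch only illustrates the linear-in-depth, roughly linear-in-width structure of the latency and the N x L count of activations; the paper's own analysis provides the device-level numbers.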
Theoretical and Practical Implications
The proposed architecture addresses a key challenge for ONNs: implementing the nonlinear activation functions that are essential for learning complex mappings. Unlike conventional all-optical nonlinearities, which are often weak and fixed once fabricated, the electro-optic activation function is reconfigurable and operates at much lower power thresholds.
Practically, this means that ONNs can be applied more effectively across a broader spectrum of machine learning tasks, potentially rivaling electronic neural networks in both speed and efficiency. This development offers significant implications for fields requiring rapid, energy-efficient computation, such as real-time image processing and signal classification.
Future Directions
Given that the electro-optic activation can synthesize low-threshold nonlinearities, future research could explore extending the architecture to deeper neural networks and to other areas of optical computing. Additionally, integration techniques could be refined to reduce power consumption further, and quantum-enhanced photonic circuitry could be explored for even greater performance benefits.
In conclusion, this paper contributes an important advance in optical computing, presenting a viable approach to implementing nonlinear activation functions in optical neural networks. The proposed architecture could be foundational for future high-speed, energy-efficient computational systems.