- The paper introduces a supervised training method for convolutional spiking neural networks using surrogate gradients to address the non-differentiable spike function.
- It models LIF neurons as recurrent cells in PyTorch, combining convolutional layers with fast horizontal connections and enabling efficient backpropagation through time.
- Experiments on a speech recognition benchmark achieved approximately 94% accuracy with a mean firing rate of about 5 Hz, highlighting energy efficiency and practical viability.
Supervised Training of Convolutional Spiking Neural Networks with PyTorch
This document describes an extension of the supervised training approach for spiking neural networks (SNNs), focusing on convolutional layers and employing PyTorch for implementation. The authors advance the training methodology by incorporating convolutional operations and fast horizontal connections into the SNN architecture, which can improve performance on specific tasks such as speech recognition.
Background and Motivation
The paper begins with an exploration of the biological inspiration behind SNNs, citing the leaky integrate-and-fire (LIF) model as a common basis for simulating the dynamics of spiking neurons. Unlike conventional artificial neural networks (ANNs), which rely on continuous-valued activations, SNNs employ discrete spikes, mimicking the communication method of biological neurons. This spike-based computation not only offers biological realism but also enhances energy efficiency, especially when implemented on neuromorphic hardware.
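For reference, the standard LIF dynamics can be written as a leaky integration of the input current, with a spike and reset when the membrane potential reaches a threshold (this is the textbook formulation; the paper's exact parameterization may differ):

```latex
\tau_m \frac{du(t)}{dt} = -\left(u(t) - u_{\mathrm{rest}}\right) + R\,I(t),
\qquad
u(t) \ge \vartheta \;\Rightarrow\; \text{spike, then } u(t) \leftarrow u_{\mathrm{reset}}
```

Here $u(t)$ is the membrane potential, $\tau_m$ the membrane time constant, $R$ the membrane resistance, $I(t)$ the input current, and $\vartheta$ the firing threshold.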
Given these hardware advantages and the need for efficient spike-based computation, the authors set out to apply SNNs to real-world tasks traditionally addressed with ANNs. Despite the many neuron models available, the transition from theoretical models to practical machine-learning applications remains challenging. This work attempts to bridge that gap by extending spiking neural network models with convolutional layers, optimized through backpropagation with surrogate gradients.
Methodology
The authors start by adapting feed-forward and convolutional neural network architectures to the SNN framework. To deal with the non-differentiable spike activation, they use surrogate gradients: the Heaviside step function is kept in the forward pass, while its ill-defined derivative is replaced by the derivative of a sigmoid in the backward pass, enabling backpropagation through time (BPTT) for training.
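As a concrete illustration, the following PyTorch sketch implements this pattern with a hard threshold in the forward pass and a sigmoid-derivative surrogate in the backward pass; the steepness value `beta` is an illustrative assumption, not the paper's reported setting.

```python
import torch

class SpikeFunction(torch.autograd.Function):
    """Heaviside step forward, sigmoid-derivative surrogate backward.
    `beta` is an illustrative steepness, not the paper's value."""

    beta = 10.0

    @staticmethod
    def forward(ctx, v):
        # v is the membrane potential minus the firing threshold
        ctx.save_for_backward(v)
        return (v > 0).float()  # emit a spike on threshold crossing

    @staticmethod
    def backward(ctx, grad_output):
        (v,) = ctx.saved_tensors
        sig = torch.sigmoid(SpikeFunction.beta * v)
        # replace the Dirac delta with the derivative of a steep sigmoid
        return grad_output * SpikeFunction.beta * sig * (1.0 - sig)

spike_fn = SpikeFunction.apply
```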
Moreover, the authors introduce a multi-dimensional send-on-delta (SoD) coding scheme implemented through a network of spiking neurons with lateral connections. This scheme tracks changes in the multi-dimensional signal as a whole rather than in each dimension independently, which can make it more effective for inputs with correlated features.
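A minimal send-on-delta sketch, assuming a Euclidean-norm trigger over the whole input vector, is shown below; the paper's actual implementation as a network of spiking neurons with lateral connections is not reproduced here, and the `threshold` value is hypothetical.

```python
import torch

def send_on_delta(signal, threshold=0.5):
    """Multi-dimensional send-on-delta sketch (threshold is hypothetical).

    signal: (T, D) tensor, a D-dimensional signal over T time steps.
    An event is emitted whenever the signal has drifted more than
    `threshold` (Euclidean norm) from the last transmitted value, so a
    correlated change across dimensions triggers a single event.
    """
    events = torch.zeros(signal.shape[0])
    last = signal[0]
    for t in range(1, signal.shape[0]):
        if torch.norm(signal[t] - last) > threshold:
            events[t] = 1.0
            last = signal[t]  # reset the reference after each event
    return events
```

Because the trigger is on the full vector, a coordinated change across several correlated dimensions produces one event instead of one event per dimension.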
To control the temporal dynamics explicitly, they model LIF neurons as recurrent neural network (RNN) cells, so that the simulation advances in discrete time steps that PyTorch can unroll and differentiate efficiently.
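The sketch below shows one way such an RNN-style LIF cell can look in PyTorch, reusing the `spike_fn` surrogate from the earlier example; the decay constant, threshold, and soft-reset-by-subtraction are simplifying assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn

class LIFCell(nn.Module):
    """Discrete-time LIF neuron as an RNN-style cell (illustrative sketch)."""

    def __init__(self, in_features, out_features, decay=0.9, threshold=1.0):
        super().__init__()
        self.fc = nn.Linear(in_features, out_features)  # synaptic weights
        self.decay = decay          # plays the role of exp(-dt / tau_m)
        self.threshold = threshold

    def forward(self, x, v):
        v = self.decay * v + self.fc(x)   # leaky integration of input current
        s = spike_fn(v - self.threshold)  # surrogate-gradient spike (see above)
        v = v - s * self.threshold        # soft reset by subtraction
        return s, v

# Unrolling the cell over time lets PyTorch's autograd perform BPTT:
cell = LIFCell(in_features=40, out_features=128)
v = torch.zeros(1, 128)
for t in range(100):
    s, v = cell(torch.randn(1, 40), v)  # dummy input at each time step
```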
Results
Experiments were performed using the Google Speech Commands dataset as a benchmark. By utilizing a spiking convolutional neural network with the aforementioned coding and training strategies, the model achieved approximately 94% accuracy on the task, which approaches the performance of state-of-the-art deep learning models. Notably, this result was accomplished with a mean firing rate of approximately 5 Hz, emphasizing the sparsity and efficiency of the network.
Implications and Future Directions
The strong performance of SNNs on speech recognition tasks, using an energy-efficient architectural design, highlights the viability of SNNs as a competitive alternative to traditional ANNs. The results suggest that the adaptations made for convolutional spiking layers, coupled with the utilization of surrogate gradients, successfully narrow the gap between theoretical potential and practical application.
Future work should focus on testing the scalability and adaptability of these SNN models across diverse datasets, such as vision tasks involving event-based sensors like dynamic vision sensors. Additionally, further research could explore the interplay between neuromorphic hardware implementations and such advanced SNN architectures to fully leverage the energy efficiency promised by spiking neural computation. The integration of event-based sampling theories also presents a promising avenue for investigation, potentially leading to richer representations and processing capabilities in SNNs.