Deep Residual Learning in Spiking Neural Networks
This paper presents an approach to training deep Spiking Neural Networks (SNNs) through a novel architecture called the Spike-Element-Wise (SEW) ResNet. SEW ResNet addresses key limitations of previous Spiking ResNet models, which copied the architecture of Artificial Neural Networks (ANNs) but could not implement identity mapping for most neuron models because of the discrete nature of spike activations. By restoring the residual-learning principle originally demonstrated by He et al. (2015), SEW ResNet improves the training efficiency and accuracy of SNNs on complex datasets.
Analysis of Spiking ResNet Drawbacks
The paper identifies two critical drawbacks of the traditional Spiking ResNet architecture. First, Spiking ResNets cannot achieve identity mapping for all neuron models: the spiking neuron placed after the residual addition emits discrete spikes and, unlike ReLU in ANN ResNets, does not in general pass its input through unchanged. Without identity mapping, deeper networks perform worse than shallower ones as depth increases (the degradation problem).
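To make the identity-mapping obstacle concrete, here is a brief sketch; the notation is adapted for this summary rather than taken verbatim from the paper. A Spiking ResNet block computes

$$o^{l} = \mathrm{SN}\!\left(F(s^{l}) + s^{l}\right),$$

where $s^{l}$ is the input spike tensor, $F$ the residual branch, and $\mathrm{SN}$ the spiking neuron layer applied after the addition. Identity mapping requires $o^{l} = s^{l}$ whenever $F(s^{l}) = 0$, i.e., $\mathrm{SN}(s^{l}) = s^{l}$. A spiking neuron generally does not reproduce an arbitrary input spike train unless its parameters happen to satisfy that special case, whereas $\mathrm{ReLU}(x) = x$ holds for any non-negative $x$ in ANN ResNets.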
Second, Spiking ResNets suffer from vanishing/exploding gradients, caused by the multiplicative compounding of per-block gradient factors across many layers. Because spikes are binary, training relies on surrogate gradients, and these surrogate-derivative factors can drive the overall gradient magnitude toward zero or toward extreme values as depth grows.
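A rough sketch of the gradient argument (the symbols below are illustrative, not the paper's exact notation): with surrogate-gradient training, backpropagating through $k$ consecutive Spiking ResNet blocks multiplies the upstream gradient by one surrogate-derivative factor per block,

$$\frac{\partial \mathcal{L}}{\partial s^{l}} \approx \frac{\partial \mathcal{L}}{\partial s^{l+k}} \prod_{i=l}^{l+k-1} \sigma'\!\left(u^{i}\right),$$

where $\sigma'$ is the surrogate derivative of the firing function and $u^{i}$ the membrane potential of the neuron that closes block $i$. Unless each factor stays close to 1, the product decays toward 0 or grows without bound as depth increases, which is the vanishing/exploding behavior described above.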
Introduction of the SEW ResNet
To counter these issues, the authors propose the SEW ResNet. In a SEW block, the residual branch itself ends in a spiking neuron, and its spike output is combined with the block input by an element-wise function g, with ADD, AND, and IAND studied as choices of g. Each choice can realize identity mapping (ADD and IAND when the branch emits no spikes, AND when it always spikes), so the shortcut path stabilizes gradient propagation across layers and mitigates the gradient problems of prior models. Element-wise addition (ADD) is especially effective: the shortcut contributes an identity term to the gradient, and the authors verify experimentally that it maintains robust gradient flow even in very deep stacks.
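The following is a minimal PyTorch-style sketch of a SEW block, written for this summary rather than copied from the authors' code; LIFNeuron is a simplified stand-in for the spiking neuron layers, with surrogate-gradient details and the multi-step temporal dimension omitted.

```python
import torch
import torch.nn as nn

class LIFNeuron(nn.Module):
    """Simplified single-step LIF-style neuron with a hard threshold.
    Stand-in for the spiking neuron layers in the paper; surrogate-gradient
    machinery is omitted for brevity."""
    def __init__(self, tau=2.0, v_threshold=1.0):
        super().__init__()
        self.tau, self.v_threshold = tau, v_threshold

    def forward(self, x):
        # Charge the membrane potential and emit a binary spike tensor.
        v = x / self.tau
        return (v >= self.v_threshold).float()

class SEWBlock(nn.Module):
    """Spike-Element-Wise residual block (sketch).
    The residual branch ends in a spiking neuron, so both operands of the
    element-wise function g are spike tensors."""
    def __init__(self, channels, g="ADD"):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            LIFNeuron(),
            nn.Conv2d(channels, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels),
            LIFNeuron(),
        )
        self.g = g

    def forward(self, s):
        a = self.branch(s)           # spike output of the residual branch
        if self.g == "ADD":          # identity mapping when the branch is silent
            return a + s
        if self.g == "AND":          # identity mapping when the branch always spikes
            return a * s
        if self.g == "IAND":         # identity mapping when the branch is silent
            return (1.0 - a) * s
        raise ValueError(f"unknown element-wise function: {self.g}")
```

One design trade-off worth noting: the sum of two spike tensors under ADD can exceed 1, so the block output is no longer strictly binary, whereas AND and IAND keep the output binary.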
Empirical Evaluation
Empirical results show that SEW ResNets outperform previously reported directly trained SNNs, achieving higher classification accuracy with fewer simulation time steps. On ImageNet, SEW ResNets achieve higher training accuracy and less degradation than traditional Spiking ResNets, especially in deeper architectures with more than 100 layers. The authors attribute this to the residual learning enabled by SEW blocks, whose identity mapping keeps gradients flowing through very deep networks.
Furthermore, on neuromorphic datasets such as DVS Gesture and CIFAR10-DVS, SEW ResNets again surpass state-of-the-art SNN accuracy while using fewer parameters and simulation time steps. This underscores the efficiency of SEW blocks on event-driven temporal data, a domain where SNNs hold a natural advantage over ANNs.
Implications and Future Directions
This research has profound implications for advancing SNN architecture, bridging the performance gap between SNNs and ANNs in complex tasks. SEW ResNet offers a viable path towards scalable deep learning in spiking scenarios, promoting further exploration of biologically inspired computational models with lower energy consumption than standard ANNs.
The SEW approach sets a precedent for future research on residual learning paradigms in SNNs. By showing that SNNs benefit from deep residual frameworks, similar to their ANN counterparts, this work potentially paves the way for advanced applications of SNNs in both static and dynamic data processing, from neuromorphic computing to energy-efficient AI.
In conclusion, the advancement detailed in the paper could drive forward the development of neuromorphic computing and offer a foundation upon which further innovative SNN architectures can be built, exploring more aggressive or specialized residual linkage configurations.