Attention Spiking Neural Networks: A Path to Energy-Efficient Intelligence
The paper "Attention Spiking Neural Networks" presents a significant exploration into the integration of attention mechanisms within Spiking Neural Networks (SNNs), aiming to close the performance gap that has historically separated SNNs from the more resource-intensive Artificial Neural Networks (ANNs). The authors propose a Multi-dimensional Attention (MA) module that augments SNNs by enabling them to focus dynamically on essential spatio-temporal information, which makes these networks more efficient and practical for real-world applications compared to traditional ANNs, especially in terms of energy consumption.
Key Contributions and Findings
The MA module introduced in this paper is a versatile component that can be seamlessly incorporated into existing SNN architectures. Its design is inspired by neural mechanisms in the biological brain, particularly the way attention processes modulate neuronal activity. The authors build on this module to construct MA-SNN architectures that infer attention weights along the temporal, channel, and spatial dimensions. Notably, integrating this attention mechanism improves the intrinsic efficiency of spiking neurons by adaptively regulating membrane potentials in a data-dependent manner.
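To make the mechanism concrete, here is a minimal PyTorch-style sketch of a multi-dimensional attention block over spike feature maps of shape [T, B, C, H, W] (timesteps, batch, channels, height, width). It uses a simplified squeeze-and-excitation pattern; the class name, parameter names, and pooling choices are illustrative assumptions, not the authors' reference implementation, which is more elaborate.

```python
import torch
import torch.nn as nn

class MultiDimAttention(nn.Module):
    """Temporal, channel, and spatial attention over spike feature maps.

    Expects input of shape [T, B, C, H, W]. A hypothetical sketch, not
    the authors' reference code.
    """

    def __init__(self, timesteps: int, channels: int, reduction: int = 4):
        super().__init__()
        hidden_t = max(1, timesteps // reduction)
        hidden_c = max(1, channels // reduction)
        # Temporal attention: one scalar weight per timestep.
        self.temporal = nn.Sequential(
            nn.Linear(timesteps, hidden_t), nn.ReLU(),
            nn.Linear(hidden_t, timesteps), nn.Sigmoid(),
        )
        # Channel attention: squeeze-and-excitation over channels.
        self.channel = nn.Sequential(
            nn.Linear(channels, hidden_c), nn.ReLU(),
            nn.Linear(hidden_c, channels), nn.Sigmoid(),
        )
        # Spatial attention: a small conv over the channel-pooled map.
        self.spatial = nn.Sequential(
            nn.Conv2d(1, 1, kernel_size=7, padding=3), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        T, B, C, H, W = x.shape
        # Temporal weights from activity pooled over everything but time.
        t_w = self.temporal(x.mean(dim=(1, 2, 3, 4)))        # [T]
        x = x * t_w.view(T, 1, 1, 1, 1)
        # Channel weights from spatially pooled activity.
        c_w = self.channel(x.mean(dim=(3, 4)))               # [T, B, C]
        x = x * c_w.view(T, B, C, 1, 1)
        # Spatial weights from channel-pooled frames.
        s_in = x.mean(dim=2, keepdim=True).reshape(T * B, 1, H, W)
        s_w = self.spatial(s_in).reshape(T, B, 1, H, W)
        return x * s_w
```

In an MA-SNN, the rescaled tensor would feed the input current of the next spiking neuron layer, so the attention weights effectively gate membrane potentials and, in turn, firing activity.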
Significant empirical evidence is provided for the superior performance of these attention-augmented SNNs across various benchmark tasks, including gesture and gait recognition on the event-based DVS128 Gesture and DVS128 Gait datasets and image classification on the large-scale ImageNet-1K dataset. The results are compelling: on the DVS128 benchmarks, spike counts are reduced by over 80% while accuracy and energy efficiency are markedly increased. On ImageNet-1K, the MA-SNN achieves a top-1 accuracy of 77.08% with substantial energy-efficiency gains over comparable ANNs. This is a noteworthy milestone: for the first time on a large-scale dataset, SNNs achieve performance competitive with, and occasionally superior to, their ANN equivalents.
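The energy claims rest on an accounting convention common in the SNN literature: a dense ANN performs a multiply-accumulate (MAC) for every connection on every forward pass, whereas an SNN performs a cheaper accumulate (AC) only when a spike actually arrives, so sparser firing directly translates into lower energy. A back-of-the-envelope sketch with illustrative numbers (not figures reported in the paper) follows; the per-operation energies are widely cited 45 nm CMOS estimates.

```python
# Rough ANN-vs-SNN energy estimate. The per-op energies are widely
# cited 45 nm CMOS figures; the op count, timesteps, and firing rate
# below are illustrative assumptions, not numbers from the paper.
E_MAC = 4.6e-12  # J per multiply-accumulate (dense ANN operation)
E_AC = 0.9e-12   # J per accumulate (SNN synaptic operation)

macs = 4e9         # hypothetical MACs in one ANN forward pass
timesteps = 4      # SNN simulation window
firing_rate = 0.1  # average fraction of potential ops triggered by spikes

ann_energy = macs * E_MAC
snn_energy = macs * timesteps * firing_rate * E_AC
print(f"ANN: {ann_energy * 1e3:.2f} mJ, SNN: {snn_energy * 1e3:.2f} mJ")
print(f"ANN/SNN energy ratio: {ann_energy / snn_energy:.1f}x")
```

Under these assumptions the SNN comes out roughly an order of magnitude cheaper, and anything that cuts spike counts, as the MA module does, shrinks the SNN's energy proportionally.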
The paper also explores the theoretical underpinnings of the MA-SNN's performance gains. An analysis based on block dynamical isometry theory elucidates how the gradient vanishing and spike degradation problems that often afflict deep SNNs can be mitigated through attention. This theoretical advance is complemented by a novel spiking response visualization technique, which shows that the MA module achieves its efficiency gains by suppressing irrelevant firing while preserving responses to salient inputs, echoing the sparse coding observed in biological brains.
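For readers unfamiliar with the term, block dynamical isometry can be stated informally as follows; the notation below is a standard rendering of the condition and is not quoted verbatim from the paper.

```latex
% The end-to-end Jacobian factors into per-block Jacobians; \phi is
% the normalized expected trace.
\[
  J = \prod_{j=1}^{L} J_j ,
  \qquad
  \phi(A) = \frac{\mathbb{E}\,\operatorname{tr}(A)}{\dim(A)} .
\]
% Block dynamical isometry: every block preserves gradient norm in
% expectation, with vanishing fluctuation around that mean.
\[
  \phi\!\left(J_j J_j^{\top}\right) \approx 1
  \quad \text{and} \quad
  \phi\!\left(\bigl(J_j J_j^{\top}\bigr)^{2}\right)
    - \phi\!\left(J_j J_j^{\top}\right)^{2} \approx 0
  \qquad \text{for all } j .
\]
```

Intuitively, when no block systematically amplifies or attenuates gradients, signals survive depth, and the paper's analysis examines how the attention-augmented dynamics keep deep SNNs near this regime.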
Implications and Future Directions
This work represents a significant stride toward resolving the longstanding efficiency and performance challenges that have hindered deploying SNNs in place of ANNs. The implications are broad, particularly for deploying neural networks in environments constrained by energy and computational resources, such as mobile devices and IoT applications.
The authors argue that their work demonstrates the potential for SNNs to serve as a general backbone in neuromorphic computing applications, offering an effective blend of performance and efficiency. Future developments could include exploring deeper and more complex SNN architectures built on MA modules, or extending these findings to additional applications such as natural language processing and real-time data-stream processing.
In sum, this paper makes valuable contributions to both the theoretical understanding and the practical capabilities of SNNs, advocating for biologically inspired attention mechanisms as a path toward energy-efficient neuromorphic computing systems.