
Spike-driven Transformer (2307.01694v1)

Published 4 Jul 2023 in cs.NE and cs.CV

Abstract: Spiking Neural Networks (SNNs) provide an energy-efficient deep learning option due to their unique spike-based event-driven (i.e., spike-driven) paradigm. In this paper, we incorporate the spike-driven paradigm into Transformer by the proposed Spike-driven Transformer with four unique properties: 1) Event-driven, no calculation is triggered when the input of Transformer is zero; 2) Binary spike communication, all matrix multiplications associated with the spike matrix can be transformed into sparse additions; 3) Self-attention with linear complexity at both token and channel dimensions; 4) The operations between spike-form Query, Key, and Value are mask and addition. Together, there are only sparse addition operations in the Spike-driven Transformer. To this end, we design a novel Spike-Driven Self-Attention (SDSA), which exploits only mask and addition operations without any multiplication, and thus having up to $87.2\times$ lower computation energy than vanilla self-attention. Especially in SDSA, the matrix multiplication between Query, Key, and Value is designed as the mask operation. In addition, we rearrange all residual connections in the vanilla Transformer before the activation functions to ensure that all neurons transmit binary spike signals. It is shown that the Spike-driven Transformer can achieve 77.1\% top-1 accuracy on ImageNet-1K, which is the state-of-the-art result in the SNN field. The source code is available at https://github.com/BICLab/Spike-Driven-Transformer.

Spike-Driven Transformer: A Synergy of Energy Efficiency and Performance

The field of neural network architecture has continuously evolved in search of a balance between computational power, energy efficiency, and task accuracy. This paper introduces the Spike-driven Transformer, which integrates the spike-driven nature of Spiking Neural Networks (SNNs) into the Transformer framework. The work is a significant contribution to the field, addressing the long-standing challenge of bringing SNNs' low-power, bio-inspired paradigm to high-performance architectures such as Transformers.

Core Contributions

The Spike-driven Transformer distinguishes itself through four unique properties:

  1. Event-Driven Computation: No computation is triggered when an input to the Transformer is zero, which significantly improves energy efficiency.
  2. Binary Spike Communication: By leveraging binary spikes for communication, conventional matrix multiplications are transformed into sparse additions, further reducing computational load.
  3. Linear-Complexity Self-Attention: The self-attention mechanism, the core of the Transformer, is redesigned to have linear complexity in both the token and channel dimensions, offering substantial computational savings.
  4. Sparse Operations Between Spike-Form Inputs: Operations among the spike-form Query, Key, and Value use only mask and addition, eliminating the need for energy-intensive multiplication.

Central to the Spike-driven Transformer is the novel Spike-Driven Self-Attention (SDSA) mechanism, which eliminates multiplication entirely in favor of computationally efficient mask and addition operations. This adjustment achieves up to 87.2 times lower computational energy than traditional self-attention, exemplifying the model's energy-efficient nature.
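
To make this concrete, the following is a minimal NumPy sketch of a spike-driven attention step in the spirit of SDSA: mask Q with K, sum over tokens, threshold the per-channel scores into a binary gate, and use that gate to mask V. The `heaviside` stand-in for the spiking neuron layer, the `v_th` threshold, and the exact factorization are simplifying assumptions for illustration, not the authors' implementation.

```python
import numpy as np

def heaviside(u, v_th=1.0):
    """Stand-in for a spiking neuron layer: fire (1) if the input reaches threshold."""
    return (u >= v_th).astype(np.float32)

def sdsa_sketch(Q, K, V, v_th=1.0):
    """Toy spike-driven self-attention on binary spike matrices of shape (tokens, channels).

    Because Q and K are binary, their Hadamard product is just a mask (logical AND),
    and the column sum over tokens involves only additions -- no matrix multiplication.
    """
    masked = Q * K                          # mask: element-wise AND of binary spikes
    channel_scores = masked.sum(axis=0)     # sum over tokens -> one integer score per channel
    gate = heaviside(channel_scores, v_th)  # binary channel gate from a threshold "neuron"
    return V * gate                         # mask V channel-wise; output stays binary

# Hypothetical usage with random binary spike tensors (196 tokens, 512 channels)
rng = np.random.default_rng(0)
Q, K, V = (rng.integers(0, 2, size=(196, 512)).astype(np.float32) for _ in range(3))
out = sdsa_sketch(Q, K, V, v_th=2.0)
```

Note that the cost of this sketch grows linearly with both the number of tokens and the number of channels, matching the linear-complexity claim above.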

Methodological Innovation

A noteworthy aspect of the proposed model is the rearrangement of residual connections so that they precede the spiking activation functions. This ensures that neurons communicate strictly through binary spikes, aligning closely with the spike-driven paradigm and minimizing energy consumption. With these adjustments, the Spike-driven Transformer achieves 77.1% top-1 accuracy on ImageNet-1K, a state-of-the-art result in the SNN field.
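
The toy sketch below illustrates this residual placement under simplifying assumptions: a single dense `weight` matrix stands in for a convolution-plus-BatchNorm stage, and a stateless threshold replaces full spiking-neuron dynamics. The point it demonstrates is that the shortcut joins the real-valued membrane potential before the spiking activation, so only binary spikes ever enter the weighted layers.

```python
import numpy as np

def spike(u, v_th=1.0):
    """Binary spike output of a simplified, stateless spiking neuron."""
    return (u >= v_th).astype(np.float32)

def block_with_pre_activation_shortcut(u_in, weight):
    """Toy residual block whose shortcut is added *before* the spiking activation.

    `weight` is a hypothetical stand-in for a conv/linear + BN stage. Only the
    binary spikes `s` enter the weighted layer; the residual addition happens
    on membrane potentials, keeping all inter-layer traffic spike-form.
    """
    s = spike(u_in)             # binary spikes leave the previous neuron layer
    u_out = s @ weight + u_in   # residual joins the membrane potential, pre-activation
    return u_out                # the next block applies spike(u_out)
```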

Energy Efficiency Analysis

The paper presents an in-depth energy analysis, highlighting the drastic reduction in power consumption made possible by the spike-driven paradigm. Traditional Transformer architectures rely on computationally expensive Multiply-and-Accumulate (MAC) operations, whereas the Spike-driven Transformer relies almost entirely on Accumulate (AC) operations, reducing energy usage while maintaining competitive accuracy.
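
As a rough back-of-the-envelope illustration (not the paper's exact accounting), the two regimes can be compared using the commonly cited 45 nm CMOS estimates of roughly 4.6 pJ per 32-bit floating-point MAC and 0.9 pJ per addition; the operation count, firing rate, and timestep values below are hypothetical.

```python
# Assumed per-operation energies (45 nm CMOS estimates, for illustration only)
E_MAC = 4.6e-12   # J per 32-bit floating-point multiply-accumulate
E_AC  = 0.9e-12   # J per 32-bit floating-point accumulate

def ann_energy(num_ops):
    """Dense ANN/Transformer layer: every synaptic operation is a MAC."""
    return num_ops * E_MAC

def snn_energy(num_ops, firing_rate, timesteps):
    """Spike-driven layer: only nonzero (spiking) inputs trigger an accumulate."""
    return num_ops * firing_rate * timesteps * E_AC

# e.g. a layer with 1e9 synaptic operations, 15% firing rate, 4 timesteps
ratio = ann_energy(1e9) / snn_energy(1e9, firing_rate=0.15, timesteps=4)
print(f"estimated energy reduction: {ratio:.1f}x")   # ~8.5x under these toy assumptions
```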

Implications and Future Directions

The implications of this research extend into both practical and theoretical domains. Practically, the Spike-driven Transformer could notably reduce the energy footprint of neural networks deployed on resource-constrained devices. Theoretically, it presents a compelling case for revisiting how bio-inspired principles can inform scalable, efficient network designs.

Future developments in AI could focus on enhancing the compatibility of more complex neural operations with the spike-driven approach, furthering the goal of universally integrating energy efficiency with computational efficacy. Additionally, exploring the deployment of the Spike-driven Transformer on neuromorphic hardware could yield intriguing insights into hardware-optimized deep learning frameworks.

In conclusion, the Spike-driven Transformer represents a significant advancement in low-power neural network design, effectively marrying the energy efficiency of SNNs with the high performance of Transformers. This work not only enriches the toolkit of AI researchers with a novel architectural option but also sets a precedent for future explorations in energy-efficient AI infrastructures.

Authors (7)
  1. Man Yao
  2. Jiakui Hu
  3. Zhaokun Zhou
  4. Li Yuan
  5. Yonghong Tian
  6. Bo Xu
  7. Guoqi Li
Citations (61)