Overview of Direct Training of High-Performance Deep Spiking Neural Networks
Spiking Neural Networks (SNNs), known as the third generation of neural networks, offer event-driven computation, high energy efficiency, and biological plausibility, making them a promising alternative to traditional artificial neural networks (ANNs). This review focuses on direct training methods for high-performance deep SNNs, surveying training algorithms and architectural advances across the domain.
Spiking Neuron Models
The Leaky Integrate-and-Fire (LIF) model serves as the cornerstone neuron of SNNs because of its simplicity. However, its fixed dynamics limit representational power, motivating more expressive variants such as PLIF, GLIF, and KLIF, which incorporate trainable parameters such as membrane time constants or dynamic thresholds. These modifications broaden SNNs' representational capabilities and adaptively tune neuronal sensitivity. Additionally, parallel spiking neurons are introduced to remove the sequential dependence between time steps so that a whole sequence can be computed at once, addressing the high computational load of traditional step-by-step simulation.
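To make the neuron dynamics concrete, below is a minimal PyTorch sketch of a discrete-time LIF neuron with a learnable membrane time constant in the spirit of PLIF; the class name, the sigmoid reparameterization of 1/tau, and the hard-reset rule are illustrative assumptions rather than the exact formulation of any cited model.

```python
import math

import torch
import torch.nn as nn


class ParametricLIF(nn.Module):
    """Discrete-time LIF neuron with a learnable membrane time constant (PLIF-style sketch).

    Forward pass only; handling the spike's non-differentiability with a surrogate
    gradient is sketched separately below.
    """

    def __init__(self, init_tau: float = 2.0, v_threshold: float = 1.0):
        super().__init__()
        # Reparameterize the decay as sigmoid(w) = 1/tau so it stays in (0, 1);
        # w = -log(tau - 1) recovers the requested initial tau (assumes tau > 1).
        self.w = nn.Parameter(torch.tensor(-math.log(init_tau - 1.0)))
        self.v_threshold = v_threshold

    def forward(self, x_seq: torch.Tensor) -> torch.Tensor:
        # x_seq: [T, batch, features] input current over T time steps.
        decay = torch.sigmoid(self.w)            # learnable 1/tau
        v = torch.zeros_like(x_seq[0])           # membrane potential
        spikes = []
        for x_t in x_seq:                        # sequential, step-by-step update
            v = v + decay * (x_t - v)            # leaky integration toward the input
            s = (v >= self.v_threshold).float()  # fire when the threshold is crossed
            v = v * (1.0 - s)                    # hard reset of neurons that spiked
            spikes.append(s)
        return torch.stack(spikes)               # [T, batch, features] binary spikes


# Example: 8 time steps, batch of 4, 16 features.
out = ParametricLIF()(torch.rand(8, 4, 16))
```

The explicit loop over time steps illustrates the sequential dependence that parallel spiking neurons are designed to remove.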
Information Encoding and SNN Training
The paper discusses information encoding strategies, focusing on rate coding and temporal coding. Time-to-first-spike (TTFS) coding, a temporal scheme in which information is carried by the latency of a neuron's first spike, is highlighted for requiring very few spikes while preserving information fidelity. Surrogate gradient methods have become pivotal for handling the non-differentiability of spiking neurons, replacing the spike function with a differentiable approximation during backpropagation. Moreover, various loss functions, including IM-Loss and RMP-Loss, are explored to improve performance by enhancing gradient flow and regulating the membrane potential distribution.
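As an illustration of the surrogate-gradient idea, the sketch below keeps the hard Heaviside spike in the forward pass and substitutes a sigmoid-derivative surrogate in the backward pass; the class name and the slope value alpha are illustrative choices, not a specific method from the review.

```python
import torch


class SurrogateSpike(torch.autograd.Function):
    """Heaviside spike forward, smooth surrogate derivative backward."""

    alpha = 4.0  # surrogate sharpness (illustrative value)

    @staticmethod
    def forward(ctx, v_minus_threshold):
        ctx.save_for_backward(v_minus_threshold)
        return (v_minus_threshold >= 0).float()  # non-differentiable step function

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors
        # Replace the true derivative (a Dirac delta) with the derivative of a
        # steep sigmoid: alpha * sigmoid(alpha * x) * (1 - sigmoid(alpha * x)).
        sg = torch.sigmoid(SurrogateSpike.alpha * x)
        return grad_output * SurrogateSpike.alpha * sg * (1.0 - sg)


# Usage inside a neuron update: s = SurrogateSpike.apply(v - v_threshold)
```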
Architectural Developments
Recent advancements include transformer-based SNNs and improved residual learning structures. Transformer architectures leverage self-attention to capture global spatial information and deliver substantial performance gains; spiking Transformers such as Spikformer achieve high accuracy with few time steps, effectively bringing transformer benefits into SNNs. For residual architectures, activation-before-addition and pre-activation shortcuts are introduced to alleviate degradation and vanishing-gradient issues, enabling deeper networks while minimizing costly non-spike (non-binary) computations.
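A schematic reading of the two shortcut placements, with a toy convolution and a hard-threshold activation standing in for a full residual block (function names, shapes, and the forward-only spike are illustrative assumptions):

```python
import torch
import torch.nn as nn

# Toy stand-ins: a 1x1 convolution as the residual branch and a hard-threshold
# "spiking activation" (forward only, no surrogate gradient).
conv_path = nn.Conv2d(8, 8, kernel_size=1, bias=False)


def spike(v):
    return (v >= 1.0).float()


def activation_before_addition(s):
    # The spiking activation fires inside the residual branch and binary spikes are
    # added on the shortcut: identity mapping is preserved, but the block output can
    # take integer values greater than 1 (a non-spike computation).
    return spike(conv_path(s)) + s


def pre_activation(s):
    # The neuron fires first and the convolution consumes binary spikes; the addition
    # happens on real-valued outputs, keeping the convolutional path purely spike-driven.
    return s + conv_path(spike(s))


s = (torch.rand(2, 8, 4, 4) > 0.5).float()  # a batch of binary spike maps
print(activation_before_addition(s).shape, pre_activation(s).shape)
```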
Software Frameworks and Neuromorphic Hardware
The review highlights the growth of, and remaining challenges in, software frameworks such as snnTorch and SpikingJelly, which support building, training, and deploying SNN models across varied platforms. Neuromorphic hardware, with designs such as Loihi and Tianjic, is pivotal for energy-efficient deployment, although current chips focus on inference rather than large-scale training. The paper calls for closer collaboration between hardware and software developers to improve efficiency and interoperability in neuromorphic systems.
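For orientation, here is a minimal sketch of building and running a small SNN with SpikingJelly's activation-based API; module paths and arguments reflect recent releases and may differ across versions, and the toy network with a rate-style readout is an assumption for illustration only.

```python
import torch
import torch.nn as nn
from spikingjelly.activation_based import neuron, surrogate, functional

# A two-layer fully connected SNN; LIF neurons use an arctangent surrogate gradient
# and are stepped one time step at a time in this sketch.
net = nn.Sequential(
    nn.Linear(784, 100),
    neuron.LIFNode(tau=2.0, surrogate_function=surrogate.ATan()),
    nn.Linear(100, 10),
    neuron.LIFNode(tau=2.0, surrogate_function=surrogate.ATan()),
)

T, batch = 4, 8
x = torch.rand(batch, 784)                               # a static input repeated at each step
out = torch.stack([net(x) for _ in range(T)]).mean(0)    # average firing over T steps as logits
functional.reset_net(net)                                # clear membrane states before the next sample
```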
Applications and Future Directions
The utility of deep SNNs spans computer vision, reinforcement learning, and autonomous robotics, with transformative implications for applications that require low-latency, energy-efficient processing. To realize this potential, the authors advocate further exploration of biologically inspired learning rules, training efficiency, and application-specific optimizations.
In summary, this review synthesizes state-of-the-art techniques for directly training deep SNNs, covering theoretical advances, architectural innovations, and applications. It outlines open challenges and opportunities for further refining these models and emphasizes the need for robust interdisciplinary efforts to advance the field.