Revisiting Batch Normalization for Training Low-latency Deep Spiking Neural Networks from Scratch (2010.01729v5)

Published 5 Oct 2020 in cs.CV, cs.AI, and cs.NE

Abstract: Spiking Neural Networks (SNNs) have recently emerged as an alternative to deep learning owing to sparse, asynchronous and binary event (or spike) driven processing, which can yield huge energy efficiency benefits on neuromorphic hardware. However, training high-accuracy and low-latency SNNs from scratch suffers from the non-differentiable nature of a spiking neuron. To address this training issue in SNNs, we revisit batch normalization and propose a temporal Batch Normalization Through Time (BNTT) technique. Most prior SNN works have disregarded batch normalization, deeming it ineffective for training temporal SNNs. Different from previous works, our proposed BNTT decouples the parameters in a BNTT layer along the time axis to capture the temporal dynamics of spikes. The temporally evolving learnable parameters in BNTT allow a neuron to control its spike rate through different time-steps, enabling low-latency and low-energy training from scratch. We conduct experiments on CIFAR-10, CIFAR-100, Tiny-ImageNet and event-driven DVS-CIFAR10 datasets. BNTT allows us to train deep SNN architectures from scratch, for the first time, on complex datasets with just 25-30 time-steps. We also propose an early exit algorithm using the distribution of parameters in BNTT to reduce the latency at inference, further improving the energy-efficiency.

Authors (2)
  1. Youngeun Kim (48 papers)
  2. Priyadarshini Panda (104 papers)
Citations (162)

Summary

Insights on Temporal Batch Normalization for Training Spiking Neural Networks

The paper presents an incisive exploration of Spiking Neural Networks (SNNs) and proposes a novel use of batch normalization, a technique coined Batch Normalization Through Time (BNTT). SNNs, characterized by sparse, asynchronous, and binary event-driven processing, offer promising energy efficiency gains when deployed on neuromorphic hardware. However, the non-differentiable nature of spiking neurons makes training difficult, particularly when aiming for both high accuracy and low latency.

Temporal Batch Normalization Through Time (BNTT)

BNTT emerges as a crucial development to address the optimization challenges in SNNs. Standard batch normalization, as used in ANN training, treats all time-steps uniformly and therefore fails to capture the temporal dynamics of evolving spike patterns. BNTT instead decouples the normalization parameters along the time axis, adapting to how spike rates change over time and thus modeling spike dynamics more faithfully. With time-specific learnable parameters, BNTT enables efficient from-scratch training of SNNs on complex datasets with far fewer time-steps than ANN-SNN conversion methods require, and more stably than surrogate-gradient training without normalization.
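The core idea, time-decoupled normalization, can be sketched in NumPy. This is an illustrative simplification, not the paper's implementation: the shapes, variable names, and the omission of running statistics and the convolutional setting are assumptions made for clarity.

```python
import numpy as np

def bntt_forward(x_seq, gammas, betas, eps=1e-5):
    """Time-decoupled batch normalization over a spike sequence (sketch).

    x_seq:  array of shape (T, N, C) -- T time-steps, batch size N, C channels.
    gammas, betas: arrays of shape (T, C) -- a separate scale/shift per
    time-step, the key difference from standard BN, which would share a
    single (gamma, beta) across all T steps.
    """
    out = np.empty_like(x_seq, dtype=float)
    for t in range(x_seq.shape[0]):
        mu = x_seq[t].mean(axis=0)            # per-channel batch mean at step t
        var = x_seq[t].var(axis=0)            # per-channel batch variance at step t
        x_hat = (x_seq[t] - mu) / np.sqrt(var + eps)
        out[t] = gammas[t] * x_hat + betas[t]  # time-specific learnable affine
    return out
```

Because each time-step owns its own gamma, the network can learn to amplify activity at informative steps and suppress it at others, which is how BNTT lets a neuron modulate its spike rate over time.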

Methodology and Experimentation

The methodology integrates BNTT within a surrogate gradient descent framework and evaluates it on multiple datasets, including CIFAR-10, CIFAR-100, Tiny-ImageNet, and DVS-CIFAR10. BNTT enables direct, stable training of deep SNN architectures from scratch, achieving notable performance improvements while reducing latency and energy consumption. The paper also introduces an early exit algorithm that exploits the temporal distribution of the learnable BNTT parameters: inference terminates at the time-step where those parameters indicate that later steps contribute little additional information.
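One plausible way to realize such an exit rule is sketched below. The specific heuristic, stopping at the first time-step whose mean gamma magnitude falls below a fraction of the peak, is a hypothetical illustration of the idea, not the paper's exact criterion.

```python
import numpy as np

def early_exit_step(gammas, ratio=0.1):
    """Pick an inference cut-off time-step from BNTT scale parameters (sketch).

    gammas: (T, C) learned per-time-step scales. Heuristic (hypothetical):
    a small gamma implies the layer suppresses activity at that step, so
    little information remains -- stop once the mean |gamma| drops below
    `ratio` of its peak across time.
    """
    mean_g = np.abs(gammas).mean(axis=1)   # average scale magnitude per step
    thresh = ratio * mean_g.max()
    for t, g in enumerate(mean_g):
        if g < thresh:
            return t                       # exit before this time-step
    return len(mean_g)                     # no exit point: run all steps
```

Because the cut-off is read directly from already-learned parameters, it costs nothing at training time and shortens the average inference window.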

Numerical Results and Implications

The results indicate that BNTT achieves competitive accuracy, surpassing some ANN-based methods, while operating within a drastically reduced temporal window. For instance, BNTT enables training on CIFAR-10 with merely 25-30 time-steps, a stark reduction compared to conversion methods that often require 500 to 1000 or more time-steps. Moreover, this approach yields approximately a 9-fold improvement in energy efficiency over conventional ANNs. The paper also reports robust performance under noisy inputs and greater resistance to adversarial attacks, suggesting broad applicability in resource-constrained environments such as IoT devices.

Theoretical and Practical Implications

From a theoretical standpoint, BNTT provides insights into the nuanced control of spike dynamics using temporal parameter adaptation, resembling adjusting firing thresholds at each time-step. Practically, this approach signifies a leap towards deploying SNNs in real-world applications where energy efficiency and processing speed are paramount. Future research could delve into optimizing BNTT further, examining its performance across more diverse SNN architectures or exploring its integration with neuromorphic hardware systems for edge computing applications.
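The threshold analogy can be made concrete with a minimal leaky integrate-and-fire (LIF) neuron. This sketch uses assumed parameter values (leak, threshold) purely for illustration; it shows why scaling the input at a time-step, as a BNTT gamma does, behaves like lowering the effective firing threshold and hence raises the spike rate there.

```python
def lif_spike_count(inputs, scale, v_th=1.0, leak=0.99):
    """Count spikes of a leaky integrate-and-fire neuron (sketch).

    Scaling the input current by `scale` (the role a BNTT gamma plays)
    is equivalent to firing against an effective threshold v_th/scale:
    a larger gamma at a time-step yields more spikes at that step.
    """
    v, spikes = 0.0, 0
    for x in inputs:
        v = leak * v + scale * x   # leaky integration of the scaled input
        if v >= v_th:
            spikes += 1
            v = 0.0                # hard reset after a spike
    return spikes
```

Driving the neuron with the same input sequence at a larger scale produces proportionally more spikes, mirroring how temporally evolving gammas let BNTT shape spike rates across time-steps.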

Conclusion

This paper marks a meaningful advancement in the training methodologies for low-latency and energy-efficient SNNs, offering innovative techniques with BNTT that markedly enhance performance while addressing long-standing challenges in SNN optimization. By enabling robust training of deep networks from scratch, BNTT stands to influence future research trajectories and enhance the practical implementations of SNNs in varied technological domains.