Towards Efficient and Accurate Spiking Neural Networks via Adaptive Bit Allocation (2506.23717v1)

Published 30 Jun 2025 in cs.NE, cs.AI, cs.CV, and cs.LG

Abstract: Multi-bit spiking neural networks (SNNs) have recently become an active research topic in the pursuit of energy-efficient and highly accurate AI. However, as more bits are involved, the associated memory and computation demands escalate to the point where the performance improvements become disproportionate. Based on the insight that different layers demonstrate different importance and that extra bits can be wasted or interfering, this paper presents an adaptive bit allocation strategy for directly trained SNNs, achieving fine-grained layer-wise allocation of memory and computation resources and thereby improving SNN efficiency and accuracy. Specifically, we parametrize the temporal lengths and the bit widths of weights and spikes, making them learnable and controllable through gradients. To address the challenges caused by changeable bit widths and temporal lengths, we propose the refined spiking neuron, which can handle different temporal lengths, enable the derivation of gradients for temporal lengths, and better suit spike quantization. In addition, we theoretically formulate the step-size mismatch problem of learnable bit widths, which can incur severe quantization errors in SNNs, and accordingly propose a step-size renewal mechanism to alleviate this issue. Experiments on various datasets, including the static CIFAR and ImageNet and the dynamic CIFAR-DVS and DVS-GESTURE, demonstrate that our methods reduce the overall memory and computation cost while achieving higher accuracy. In particular, our SEWResNet-34 achieves a 2.69% accuracy gain and a 4.16× lower bit budget over the advanced baseline on ImageNet. This work will be fully open-sourced.

Summary

  • The paper presents an adaptive bit allocation method that learns bit widths and temporal lengths to optimize SNN performance while lowering resource demands.
  • Experimental results show a 2.69% accuracy gain on ImageNet and a 4.16x reduction in bit budget, validating the method's efficiency.
  • The refined spiking neuron model reduces quantization errors and paves the way for integration with neuromorphic hardware for real-time processing.

Overview of "Towards Efficient and Accurate Spiking Neural Networks via Adaptive Bit Allocation"

This paper presents an adaptive optimization technique for enhancing Spiking Neural Networks (SNNs) by allocating bits efficiently across different layers. The method aims to reduce memory and computation demands while maintaining or improving accuracy. The key strategy is to make the temporal lengths and the bit widths of weights and spikes learnable, enabling fine-grained, layer-wise allocation of resources.

Adaptive Bit Allocation Strategy

Parametrization

The paper introduces a framework wherein the temporal lengths ($T_l$), spike bit widths ($B_{s,l}$), and weight bit widths ($B_{w,l}$) become learnable parameters. These parameters are bounded and optimized via backpropagation. The parametrization is:

$$B_{s,l} = \left\lfloor \mathrm{clip}\big(\hat{B}_{s,l},\, 1,\, B_{s,bound}\big) \right\rceil, \quad T_{l} = \left\lfloor \mathrm{clip}\big(\hat{T}_{l},\, 1,\, T_{bound}\big) \right\rceil, \quad B_{w,l} = \left\lfloor \mathrm{clip}\big(\hat{B}_{w,l},\, 1,\, B_{w,bound}\big) \right\rceil$$

where $\lfloor \cdot \rceil$ denotes rounding to the nearest integer.

This approach allows bit widths to be tuned to the specific requirements of each layer, minimizing overhead while improving network performance.
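The clip-then-round pattern above is non-differentiable at the rounding step. Below is a minimal PyTorch sketch, assuming a standard straight-through estimator for the rounding (the paper derives its own gradients, e.g., for temporal lengths via the refined neuron; all names and initial values here are illustrative):

```python
import torch

def ste_round(x: torch.Tensor) -> torch.Tensor:
    # Round to the nearest integer, but pass gradients straight through.
    return (x.round() - x).detach() + x

def parametrize(hat: torch.Tensor, lo: float, hi: float) -> torch.Tensor:
    # Clip the learnable real-valued parameter to [lo, hi], then round,
    # keeping the resulting integer bit width / temporal length trainable.
    return ste_round(torch.clamp(hat, lo, hi))

# Hypothetical per-layer parameters (initial values are arbitrary).
hat_B_s = torch.tensor(2.7, requires_grad=True)  # spike bit width
hat_T   = torch.tensor(3.2, requires_grad=True)  # temporal length
hat_B_w = torch.tensor(4.9, requires_grad=True)  # weight bit width

B_s = parametrize(hat_B_s, 1, 4)  # -> 3.0
T   = parametrize(hat_T,   1, 6)  # -> 3.0
B_w = parametrize(hat_B_w, 1, 8)  # -> 5.0
```

Because the straight-through gradient is nonzero inside the clipping bounds, the bit widths can shrink or grow during training as the loss dictates.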

Formulation of the Refined Parametric Spiking Neuron

Once bit widths and temporal lengths become learnable, the conventional spiking neuron formulation is no longer sufficient. The paper therefore proposes a refined spiking neuron whose firing equation is reformulated so that membrane potentials are effectively rounded rather than floored, reducing quantization errors:

$$S^{t}_{out,l} = \mathrm{clip}\left( \left\lfloor \frac{v^{t}_{l}}{V^{1,t}_{th,l} + V^{2,t}_{th,l}} \right\rfloor,\; 0,\; 2^{B^{t}_{s,l}-1} \right)$$

This refined model includes a voltage threshold shift $V^{2,t}_{th,l}$ so that the floor operation behaves like a rounding of the membrane potential (a code sketch follows Figure 1).

Figure 1: Overview of the proposed bit allocation method. Green notations denote the parametrized constants.
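A minimal sketch of one firing step of this neuron in PyTorch, under the assumption that the shift $V^{2,t}_{th,l}$ is supplied externally (the paper's exact schedule for the shift is not reproduced here; all names are illustrative):

```python
import torch

def refined_neuron_fire(v: torch.Tensor,
                        v_th1: torch.Tensor,
                        v_th2: torch.Tensor,
                        B_s: int) -> torch.Tensor:
    # Divide the membrane potential by the shifted threshold, floor,
    # and clip to the range representable by the current spike bit width.
    # The shift v_th2 is what lets the floor emulate rounding.
    max_level = 2 ** (B_s - 1)
    s = torch.floor(v / (v_th1 + v_th2))
    return torch.clamp(s, 0, max_level)

# Toy usage: with B_s = 1 and no shift, this reduces to the familiar
# binary fire/no-fire rule with threshold 1.0.
v = torch.tensor([0.2, 0.9, 2.3])
out = refined_neuron_fire(v, torch.tensor(1.0), torch.tensor(0.0), B_s=1)
# out -> tensor([0., 0., 1.])
```

In training, the floor would additionally need a surrogate gradient (as in the parametrization sketch above) so that errors can propagate through spike generation.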

Experimental Evaluation

Results on CIFAR and ImageNet

The proposed networks were tested across several datasets and architectures, including ResNet-based networks on CIFAR and ImageNet. Notably, the models achieve competitive accuracy with significantly reduced memory requirements and computational effort, as measured by the Bit Budget and S-ACE metrics (an illustrative Bit Budget computation is sketched after Figure 2).

Figure 2: Comparisons with other advanced direct-trained SNNs using ResNet-based architectures on CIFAR100 and ImageNet-1k. Our models maintain the same level of model size as our baselines.
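This summary does not define Bit Budget precisely; one plausible reading (an assumption, not the paper's exact formula) is an operation-weighted accumulation of each layer's weight bits, spike bits, and timesteps, which makes clear why trimming bits in unimportant layers pays off:

```python
def bit_budget(layers) -> float:
    # Accumulate a per-layer cost ops * B_w * B_s * T. Illustrative
    # reading of the 'Bit Budget' metric, not the paper's definition.
    return sum(l["ops"] * l["B_w"] * l["B_s"] * l["T"] for l in layers)

# Two hypothetical layers: halving bits in the cheap layer barely moves
# the total, while the allocator can protect the expensive layer.
layers = [
    {"ops": 1.0e6, "B_w": 4, "B_s": 2, "T": 2},
    {"ops": 0.5e6, "B_w": 2, "B_s": 1, "T": 4},
]
print(bit_budget(layers))  # 20000000.0
```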

Performance Metrics and Efficiency Gains

Experiments demonstrate a 2.69% accuracy gain on ImageNet with a 4.16× reduction in bit budget compared to the baseline. Such improvements hold consistently across multiple datasets, validating the efficacy of the learning-based bit allocation strategy.

Theoretical and Practical Implications

Mitigation of Step-Size Mismatch

The method also introduces a renewal mechanism to address the step-size mismatch issue, which arises when bit widths change dynamically during training: a quantizer step size learned for the old bit width no longer matches the new one, which can incur severe quantization errors. By detecting and correcting these mismatches, the method ensures robust training of spiking networks without significant loss in accuracy (a plausible renewal rule is sketched after Figure 3).

Figure 3: Average bit width changes of ResNet-20 and Spikformer on CIFAR-10 during the adaptive-bit-width training. Tar. abbreviates target bit width. W and S denote weight and spike bit width, respectively.
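The summary does not spell out the renewal rule itself. A hedged sketch of one plausible rule (an assumption): when the rounded bit width jumps, rescale the step size so that the quantizer's representable range is preserved:

```python
import torch

def renew_step_size(step: torch.Tensor, b_old: int, b_new: int) -> torch.Tensor:
    # Keep the representable range step * (2**b - 1) constant when the
    # learnable bit width changes. One plausible renewal rule; not
    # necessarily the paper's exact mechanism.
    if b_old == b_new:
        return step
    return step * (2 ** b_old - 1) / (2 ** b_new - 1)

step = torch.tensor(0.05)           # range: 0.05 * 7  = 0.35 at 3 bits
step = renew_step_size(step, 3, 4)  # range: step * 15 = 0.35 at 4 bits
# step -> tensor(0.0233)
```

Without such a correction, a bit-width jump silently stretches or shrinks the quantization range, which is the mismatch the paper formulates theoretically.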

Future Directions

The adaptive approach underscores the potential for greater resource efficiency in neuromorphic computing applications. Future directions include integrating this approach into neuromorphic hardware platforms, thereby reducing energy consumption and enhancing the real-time processing capabilities of SNNs.

Conclusion

Overall, the work provides a compelling framework for optimizing SNNs through adaptive bit allocation. By balancing precision and resource consumption, the method reduces computational costs while maintaining high model performance, signifying a substantial advancement in spiking neural network design methodologies.
