AutoST: Training-free Neural Architecture Search for Spiking Transformers (2307.00293v2)
Abstract: Spiking Transformers have gained considerable attention because they achieve both the energy efficiency of Spiking Neural Networks (SNNs) and the high capacity of Transformers. However, existing Spiking Transformer architectures, derived from Artificial Neural Networks (ANNs), exhibit a notable architectural gap, resulting in suboptimal performance compared to their ANN counterparts, and manually discovering optimal architectures is time-consuming. To address these limitations, we introduce AutoST, a training-free NAS method for Spiking Transformers, to rapidly identify high-performance Spiking Transformer architectures. Unlike existing training-free NAS metrics, which struggle with the non-differentiability and high sparsity inherent in SNNs, we propose to utilize Floating-Point Operations (FLOPs) as a performance metric: it is independent of model computations and training dynamics, and correlates more strongly with final performance than existing training-free metrics. Our extensive experiments show that AutoST models outperform state-of-the-art manually and automatically designed SNN architectures on static and neuromorphic datasets. Full code, models, and data are released for reproducibility.
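The core idea, scoring candidate architectures by an analytic FLOPs count instead of a gradient- or activation-based proxy, can be illustrated with a short sketch. The snippet below is a minimal, hypothetical random-search variant, not the paper's actual method: the `Candidate` fields, search-space values, parameter budget, and the `count_params`/`count_flops` formulas are illustrative assumptions (the real AutoST search space and search strategy are defined in the paper); it only shows the training-free pattern of ranking budget-feasible Transformer configurations by estimated FLOPs.

```python
import random
from dataclasses import dataclass

# Hypothetical search-space entry; the real AutoST search space is defined in the paper.
@dataclass
class Candidate:
    embed_dim: int
    depth: int
    mlp_ratio: int
    num_heads: int  # does not change params/FLOPs for standard attention

def count_params(c: Candidate) -> int:
    """Rough parameter count of a ViT-style encoder stack (patch embed and head omitted)."""
    attn = 4 * c.embed_dim * c.embed_dim                      # QKV + output projections
    mlp = 2 * c.embed_dim * (c.mlp_ratio * c.embed_dim)       # two MLP layers
    return c.depth * (attn + mlp)

def count_flops(c: Candidate, num_tokens: int = 196) -> int:
    """Analytic forward-pass FLOPs of the same stack over `num_tokens` tokens."""
    attn_proj = 4 * 2 * num_tokens * c.embed_dim * c.embed_dim
    attn_matmul = 2 * 2 * num_tokens * num_tokens * c.embed_dim
    mlp = 2 * 2 * num_tokens * c.embed_dim * (c.mlp_ratio * c.embed_dim)
    return c.depth * (attn_proj + attn_matmul + mlp)

def search(num_samples: int = 500, param_budget: int = 30_000_000) -> Candidate:
    """Training-free search: sample candidates, keep those within the parameter
    budget, and return the one with the highest FLOPs score."""
    best, best_score = None, -1
    for _ in range(num_samples):
        c = Candidate(
            embed_dim=random.choice([192, 256, 384, 512]),
            depth=random.choice([4, 6, 8, 10]),
            mlp_ratio=random.choice([2, 3, 4]),
            num_heads=random.choice([4, 8]),
        )
        if count_params(c) > param_budget:
            continue
        score = count_flops(c)
        if score > best_score:
            best, best_score = c, score
    return best

if __name__ == "__main__":
    print(search())
```

Because the score requires no forward passes, gradients, or spike statistics, such a search runs in seconds on CPU, which is what makes the FLOPs proxy attractive for sparse, non-differentiable SNNs.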