TCJA-SNN: Temporal-Channel Joint Attention for Spiking Neural Networks (2206.10177v3)
Abstract: Spiking Neural Networks (SNNs) are attracting widespread interest due to their biological plausibility, energy efficiency, and powerful spatio-temporal information representation ability. Given the critical role of attention mechanisms in enhancing neural network performance, the integration of SNNs and attention mechanisms has the potential to deliver energy-efficient, high-performance computing paradigms. We present a novel Temporal-Channel Joint Attention mechanism for SNNs, referred to as TCJA-SNN. The proposed TCJA-SNN framework can effectively assess the significance of spike sequences along both the temporal and channel dimensions. More specifically, our essential technical contributions lie in: 1) We employ a squeeze operation to compress the spike stream into an average matrix, and then leverage two local attention mechanisms based on efficient 1D convolutions to extract features at the temporal and channel levels independently. 2) We introduce the Cross Convolutional Fusion (CCF) layer as a novel approach to model the inter-dependencies between the temporal and channel scopes. This layer breaks the independence of the two dimensions and enables interaction between their features. Experimental results demonstrate that the proposed TCJA-SNN outperforms the state of the art (SOTA) by up to 15.7% in accuracy on standard static and neuromorphic datasets, including Fashion-MNIST, CIFAR10-DVS, N-Caltech 101, and DVS128 Gesture. Furthermore, we apply the TCJA-SNN framework to image generation tasks by leveraging a variational autoencoder. To the best of our knowledge, this study is the first to employ an SNN attention mechanism for both image classification and generation tasks. Notably, our approach achieves SOTA performance in both domains, marking a significant advancement in the field. Code is available at https://github.com/ridgerchu/TCJA.
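Working only from the description in the abstract, the sketch below shows how such a temporal-channel joint attention block could look in PyTorch: a spatial squeeze produces a (T, C) average matrix per sample, two 1D convolutions attend locally along the temporal and channel axes, and their outputs are fused elementwise in a CCF-style step to rescale the spike tensor. The module name `TCJA`, the default kernel size, and the `(T, N, C, H, W)` tensor layout are illustrative assumptions, not the authors' exact configuration; their implementation lives in the linked repository.

```python
import torch
import torch.nn as nn


class TCJA(nn.Module):
    """Minimal sketch of temporal-channel joint attention, per the abstract.

    Assumes input spike tensors of shape (T, N, C, H, W); kernel sizes and
    layout are illustrative, not the official implementation.
    """

    def __init__(self, channels: int, timesteps: int, kernel_size: int = 3):
        super().__init__()
        pad = kernel_size // 2
        # Local attention along the temporal axis: 1D conv over T,
        # with the C channels acting as conv channels.
        self.conv_t = nn.Conv1d(channels, channels, kernel_size, padding=pad, bias=False)
        # Local attention along the channel axis: 1D conv over C,
        # with the T time steps acting as conv channels.
        self.conv_c = nn.Conv1d(timesteps, timesteps, kernel_size, padding=pad, bias=False)
        self.sigmoid = nn.Sigmoid()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        T, N, C, H, W = x.shape
        # Squeeze: average the spike stream over space -> (N, T, C).
        z = x.mean(dim=(3, 4)).permute(1, 0, 2)
        # Temporal branch: convolve along T -> (N, C, T).
        a_t = self.conv_t(z.transpose(1, 2))
        # Channel branch: convolve along C -> (N, T, C).
        a_c = self.conv_c(z)
        # Cross-fusion: elementwise product of the two branches, squashed
        # to a (T, C) attention score per sample.
        score = self.sigmoid(a_t.transpose(1, 2) * a_c)  # (N, T, C)
        # Broadcast the score over the spatial dims and rescale the input.
        return x * score.permute(1, 0, 2).reshape(T, N, C, 1, 1)
```

Under these assumptions the block is a drop-in rescaling layer: `TCJA(channels=128, timesteps=8)(x)` leaves the tensor shape unchanged while reweighting each (time step, channel) pair before the spikes reach the next layer.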
Authors: Rui-Jie Zhu, Qihang Zhao, Haoyu Deng, Yule Duan, Malu Zhang, Liang-jian Deng